Next: Improved copy, Previous: Improved forloop, Up: Answers [Contents][Index]
foreach
The foreach
and foreachq
macros (see Foreach) as
presented earlier each have flaws. First, we will examine and fix the
quadratic behavior of foreachq
:
$ m4 -I examples include(`foreachq.m4') ⇒ traceon(`shift')debugmode(`aq') ⇒ foreachq(`x', ``1', `2', `3', `4'', `x ')dnl ⇒1 error→m4trace: -3- shift(`1', `2', `3', `4') error→m4trace: -2- shift(`1', `2', `3', `4') ⇒2 error→m4trace: -4- shift(`1', `2', `3', `4') error→m4trace: -3- shift(`2', `3', `4') error→m4trace: -3- shift(`1', `2', `3', `4') error→m4trace: -2- shift(`2', `3', `4') ⇒3 error→m4trace: -5- shift(`1', `2', `3', `4') error→m4trace: -4- shift(`2', `3', `4') error→m4trace: -3- shift(`3', `4') error→m4trace: -4- shift(`1', `2', `3', `4') error→m4trace: -3- shift(`2', `3', `4') error→m4trace: -2- shift(`3', `4') ⇒4 error→m4trace: -6- shift(`1', `2', `3', `4') error→m4trace: -5- shift(`2', `3', `4') error→m4trace: -4- shift(`3', `4') error→m4trace: -3- shift(`4')
Each successive iteration was adding more quoted shift
invocations, and the entire list contents were passing through every
iteration. In general, when recursing, it is a good idea to make the
recursion use fewer arguments, rather than adding additional quoted
uses of shift
. By doing so, m4
uses less memory, invokes
fewer macros, is less likely to run into machine limits, and most
importantly, performs faster. The fixed version of foreachq
can
be found in m4-1.4.19/examples/foreachq2.m4:
$ m4 -I examples include(`foreachq2.m4') ⇒ undivert(`foreachq2.m4')dnl ⇒include(`quote.m4')dnl ⇒divert(`-1') ⇒# foreachq(x, `item_1, item_2, ..., item_n', stmt) ⇒# quoted list, improved version ⇒define(`foreachq', `pushdef(`$1')_$0($@)popdef(`$1')') ⇒define(`_arg1q', ``$1'') ⇒define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@))')') ⇒define(`_foreachq', `ifelse(`$2', `', `', ⇒ `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')') ⇒divert`'dnl traceon(`shift')debugmode(`aq') ⇒ foreachq(`x', ``1', `2', `3', `4'', `x ')dnl ⇒1 error→m4trace: -3- shift(`1', `2', `3', `4') ⇒2 error→m4trace: -3- shift(`2', `3', `4') ⇒3 error→m4trace: -3- shift(`3', `4') ⇒4
Note that the fixed version calls unquoted helper macros in
_foreachq
to trim elements immediately; those helper macros
in turn must re-supply the layer of quotes lost in the macro invocation.
Contrast the use of _arg1q
, which quotes the first list
element, with _arg1
of the earlier implementation that
returned the first list element directly. Additionally, by calling the
helper method immediately, the ‘defn(`iterator')’ no longer
contains unexpanded macros.
The astute m4 programmer might notice that the solution above still uses
more memory and macro invocations, and thus more time, than strictly
necessary. Note that ‘$2’, which contains an arbitrarily long
quoted list, is expanded and rescanned three times per iteration of
_foreachq
. Furthermore, every iteration of the algorithm
effectively unboxes then reboxes the list, which costs a couple of macro
invocations. It is possible to rewrite the algorithm for a bit more
speed by swapping the order of the arguments to _foreachq
in
order to operate on an unboxed list in the first place, and by using the
fixed-length ‘$#’ instead of an arbitrary length list as the key to
end recursion. The result is an overhead of six macro invocations per
loop (excluding any macros in text), instead of eight. This
alternative approach is available as
m4-1.4.19/examples/foreach3.m4:
$ m4 -I examples include(`foreachq3.m4') ⇒ undivert(`foreachq3.m4')dnl ⇒divert(`-1') ⇒# foreachq(x, `item_1, item_2, ..., item_n', stmt) ⇒# quoted list, alternate improved version ⇒define(`foreachq', `ifelse(`$2', `', `', ⇒ `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')') ⇒define(`_foreachq', `ifelse(`$#', `3', `', ⇒ `define(`$1', `$4')$2`'$0(`$1', `$2', ⇒ shift(shift(shift($@))))')') ⇒divert`'dnl traceon(`shift')debugmode(`aq') ⇒ foreachq(`x', ``1', `2', `3', `4'', `x ')dnl ⇒1 error→m4trace: -4- shift(`x', `x error→', `', `1', `2', `3', `4') error→m4trace: -3- shift(`x error→', `', `1', `2', `3', `4') error→m4trace: -2- shift(`', `1', `2', `3', `4') ⇒2 error→m4trace: -4- shift(`x', `x error→', `1', `2', `3', `4') error→m4trace: -3- shift(`x error→', `1', `2', `3', `4') error→m4trace: -2- shift(`1', `2', `3', `4') ⇒3 error→m4trace: -4- shift(`x', `x error→', `2', `3', `4') error→m4trace: -3- shift(`x error→', `2', `3', `4') error→m4trace: -2- shift(`2', `3', `4') ⇒4 error→m4trace: -4- shift(`x', `x error→', `3', `4') error→m4trace: -3- shift(`x error→', `3', `4') error→m4trace: -2- shift(`3', `4')
In the current version of M4, every instance of ‘$@’ is rescanned
as it is encountered. Thus, the foreachq3.m4 alternative uses
much less memory than foreachq2.m4, and executes as much as 10%
faster, since each iteration encounters fewer ‘$@’. However, the
implementation of rescanning every byte in ‘$@’ is quadratic in
the number of bytes scanned (for example, making the broken version in
foreachq.m4 cubic, rather than quadratic, in behavior). A future
release of M4 will improve the underlying implementation by reusing
results of previous scans, so that both styles of foreachq
can
become linear in the number of bytes scanned. Notice how the
implementation injects an empty argument prior to expanding ‘$2’
within foreachq
; the helper macro _foreachq
then ignores
the third argument altogether, and ends recursion when there are three
arguments left because there was nothing left to pass through
shift
. Thus, each iteration only needs one ifelse
, rather
than the two conditionals used in the version from foreachq2.m4.
So far, all of the implementations of foreachq
presented have
been quadratic with M4 1.4.x. But forloop
is linear, because
each iteration parses a constant amount of arguments. So, it is
possible to design a variant that uses forloop
to do the
iteration, then uses ‘$@’ only once at the end, giving a linear
result even with older M4 implementations. This implementation relies
on the GNU extension that ‘$10’ expands to the tenth
argument rather than the first argument concatenated with ‘0’. The
trick is to define an intermediate macro that repeats the text
m4_define(`$1', `$n')$2`'
, with ‘n’ set to successive
integers corresponding to each argument. The helper macro
_foreachq_
is needed in order to generate the literal sequences
such as ‘$1’ into the intermediate macro, rather than expanding
them as the arguments of _foreachq
. With this approach, no
shift
calls are even needed! Even though there are seven macros
of overhead per iteration instead of six in foreachq3.m4, the
linear scaling is apparent at relatively small list sizes. However,
this approach will need adjustment when a future version of M4 follows
POSIX by no longer treating ‘$10’ as the tenth argument;
the anticipation is that ‘${10}’ can be used instead, although
that alternative syntax is not yet supported.
$ m4 -I examples include(`foreachq4.m4') ⇒ undivert(`foreachq4.m4')dnl ⇒include(`forloop2.m4')dnl ⇒divert(`-1') ⇒# foreachq(x, `item_1, item_2, ..., item_n', stmt) ⇒# quoted list, version based on forloop ⇒define(`foreachq', ⇒`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')') ⇒define(`_foreachq', ⇒`pushdef(`$1', forloop(`$1', `3', `$#', ⇒ `$0_(`1', `2', indir(`$1'))')`popdef( ⇒ `$1')')indir(`$1', $@)') ⇒define(`_foreachq_', ⇒``define(`$$1', `$$3')$$2`''') ⇒divert`'dnl traceon(`shift')debugmode(`aq') ⇒ foreachq(`x', ``1', `2', `3', `4'', `x ')dnl ⇒1 ⇒2 ⇒3 ⇒4
For yet another approach, the improved version of foreach
,
available in m4-1.4.19/examples/foreach2.m4, simply
overquotes the arguments to _foreach
to begin with, using
dquote_elt
. Then _foreach
can just use
_arg1
to remove the extra layer of quoting that was added up
front:
$ m4 -I examples include(`foreach2.m4') ⇒ undivert(`foreach2.m4')dnl ⇒include(`quote.m4')dnl ⇒divert(`-1') ⇒# foreach(x, (item_1, item_2, ..., item_n), stmt) ⇒# parenthesized list, improved version ⇒define(`foreach', `pushdef(`$1')_$0(`$1', ⇒ (dquote(dquote_elt$2)), `$3')popdef(`$1')') ⇒define(`_arg1', `$1') ⇒define(`_foreach', `ifelse(`$2', `(`')', `', ⇒ `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')') ⇒divert`'dnl traceon(`shift')debugmode(`aq') ⇒ foreach(`x', `(`1', `2', `3', `4')', `x ')dnl error→m4trace: -4- shift(`1', `2', `3', `4') error→m4trace: -4- shift(`2', `3', `4') error→m4trace: -4- shift(`3', `4') ⇒1 error→m4trace: -3- shift(``1'', ``2'', ``3'', ``4'') ⇒2 error→m4trace: -3- shift(``2'', ``3'', ``4'') ⇒3 error→m4trace: -3- shift(``3'', ``4'') ⇒4 error→m4trace: -3- shift(``4'')
It is likewise possible to write a variant of foreach
that
performs in linear time on M4 1.4.x; the easiest method is probably
writing a version of foreach
that unboxes its list, then invokes
_foreachq
as previously defined in foreachq4.m4.
In summary, recursion over list elements is trickier than it appeared at
first glance, but provides a powerful idiom within m4
processing.
As a final demonstration, both list styles are now able to handle
several scenarios that would wreak havoc on one or both of the original
implementations. This points out one other difference between the
list styles. foreach
evaluates unquoted list elements only once,
in preparation for calling _foreach
, similary for
foreachq
as provided by foreachq3.m4 or
foreachq4.m4. But
foreachq
, as provided by foreachq2.m4,
evaluates unquoted list elements twice while visiting the first list
element, once in _arg1q
and once in _rest
. When
deciding which list style to use, one must take into account whether
repeating the side effects of unquoted list elements will have any
detrimental effects.
$ m4 -I examples include(`foreach2.m4') ⇒ include(`foreachq2.m4') ⇒ dnl 0-element list: foreach(`x', `', `<x>') / foreachq(`x', `', `<x>') ⇒ / dnl 1-element list of empty element foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>') ⇒<> / <> dnl 2-element list of empty elements foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>') ⇒<><> / <><> dnl 1-element list of a comma foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>') ⇒<,> / <,> dnl 2-element list of unbalanced parentheses foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>') ⇒<(><)> / <(><)> define(`ab', `oops')dnl using defn(`iterator') foreach(`x', `(`a', `b')', `defn(`x')') /dnl foreachq(`x', ``a', `b'', `defn(`x')') ⇒ab / ab define(`active', `ACT, IVE') ⇒ traceon(`active') ⇒ dnl list of unquoted macros; expansion occurs before recursion foreach(`x', `(active, active)', `<x> ')dnl error→m4trace: -4- active -> `ACT, IVE' error→m4trace: -4- active -> `ACT, IVE' ⇒<ACT> ⇒<IVE> ⇒<ACT> ⇒<IVE> foreachq(`x', `active, active', `<x> ')dnl error→m4trace: -3- active -> `ACT, IVE' error→m4trace: -3- active -> `ACT, IVE' ⇒<ACT> error→m4trace: -3- active -> `ACT, IVE' error→m4trace: -3- active -> `ACT, IVE' ⇒<IVE> ⇒<ACT> ⇒<IVE> dnl list of quoted macros; expansion occurs during recursion foreach(`x', `(`active', `active')', `<x> ')dnl error→m4trace: -1- active -> `ACT, IVE' ⇒<ACT, IVE> error→m4trace: -1- active -> `ACT, IVE' ⇒<ACT, IVE> foreachq(`x', ``active', `active'', `<x> ')dnl error→m4trace: -1- active -> `ACT, IVE' ⇒<ACT, IVE> error→m4trace: -1- active -> `ACT, IVE' ⇒<ACT, IVE> dnl list of double-quoted macro names; no expansion foreach(`x', `(``active'', ``active'')', `<x> ')dnl ⇒<active> ⇒<active> foreachq(`x', ```active'', ``active''', `<x> ')dnl ⇒<active> ⇒<active>
Next: Improved copy, Previous: Improved forloop, Up: Answers [Contents][Index]