Cranelift: aarch64: user-controlled recursion in lower_fmla #12368

@mmcloughlin

Description

The following lower_fmla rules enable user-controlled recursion.

;; Special case: if one of the multiplicands is `fneg` then peel that away,
;; reverse the operation being performed, and then recurse on `lower_fmla`
;; again to generate the actual instruction.
;;
;; Note that these are the highest priority cases for `lower_fmla` to peel
;; away as many `fneg` operations as possible.
(rule 5 (lower_fmla op (fneg x) y z size)
        (lower_fmla (neg_fmla op) x y z size))
(rule 6 (lower_fmla op x (fneg y) z size)
        (lower_fmla (neg_fmla op) x y z size))

https://github.com/bytecodealliance/wasmtime/blob/v40.0.2/cranelift/codegen/src/isa/aarch64/lower.isle#L606-L615

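These rules will peel away an arbitrary number of fneg operations wrapping the
multiplicands of an fma instruction, alternating between a fused-multiply-add
and fused-multiply-subtract operation. Each peeled fneg costs one recursive
call to lower_fmla, so the recursion depth is determined by the length of the
fneg chain in the input function.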
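
To illustrate why the depth tracks the input, here is a toy Python model of the two rules. It is only a sketch and not Cranelift code: operands are represented as hypothetical `(name, fneg_count)` pairs instead of data-flow graph values, and the strings `"add"` and `"sub"` stand in for the two fused operations. The recursive function makes one call per peeled fneg, as rules 5 and 6 do. The second function relies on the observation that only the parity of the total fneg count affects the final operation.

```python
def neg_fmla(op):
    """Swap multiply-add and multiply-subtract (toy stand-in for neg_fmla)."""
    return "sub" if op == "add" else "add"


def lower_fmla_recursive(op, x, y):
    """Mirror rules 5 and 6: one recursive call per peeled fneg."""
    (x_name, x_negs), (y_name, y_negs) = x, y
    if x_negs > 0:  # rule 5: first multiplicand is (fneg x)
        return lower_fmla_recursive(neg_fmla(op), (x_name, x_negs - 1), y)
    if y_negs > 0:  # rule 6: second multiplicand is (fneg y)
        return lower_fmla_recursive(neg_fmla(op), x, (y_name, y_negs - 1))
    return (op, x_name, y_name)


def lower_fmla_by_parity(op, x, y):
    """Same result without recursion: only the parity of the fnegs matters."""
    (x_name, x_negs), (y_name, y_negs) = x, y
    if (x_negs + y_negs) % 2 == 1:
        op = neg_fmla(op)
    return (op, x_name, y_name)
```

In this model, `lower_fmla_recursive("add", ("x", 100000), ("y", 100000))` exceeds Python's default recursion limit and raises `RecursionError`, which is loosely analogous to the native stack overflow reported below. `lower_fmla_by_parity` handles the same input in constant stack space. A real fix would still have to walk the fneg chain in the data-flow graph, which this model abstracts away as a count.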

.clif Test Case

Generate a CLIF function with a stack of fneg arguments using a script like
fneg.py:

import sys


def generate_fneg_fma_rec(count):
    print("function %fma_fneg_rec(f32x4, f32x4, f32x4) -> f32x4 {")
    print("block0(v1: f32x4, v2: f32x4, v3: f32x4):")

    n = 4
    for _ in range(count):
        print(f"    v{n} = fneg v{n - 2}")
        print(f"    v{n + 1} = fneg v{n - 1}")
        n += 2

    print(f"    v{n} = fma v{n - 2}, v{n - 1}, v1")
    print(f"    return v{n}")
    print("}")


def main():
    count = int(sys.argv[1]) if len(sys.argv) > 1 else 1
    generate_fneg_fma_rec(count)


if __name__ == "__main__":
    main()

Steps to Reproduce

Generate a large instance:

python3 fneg.py 100000 >fneg100000.clif

Compile with clif-util (at v40.0.2):

cargo run --bin clif-util -- compile --target aarch64 -p --disasm fneg100000.clif

Expected Results

The function should compile and execute successfully. Ideally, it would be
optimized to a single fma.

Actual Results

For a sufficiently large instance, the rule recursion leads to a stack
overflow:

thread 'main' (2897042) has overflowed its stack
fatal runtime error: stack overflow, aborting
fish: Job 1, 'cargo run --bin clif-util -- co…' terminated by signal SIGABRT (Abort)

Versions and Environment

Cranelift version or commit: v40.0.2

Operating system: macOS

Architecture: AArch64

Extra Info

Related #12333

Labels: bug, cranelift