Root Cause Analysis of a Maglev Graph Builder Bug or Vulnerability?!

VXRL
8 min read · Dec 29, 2023

@darkfloyd1014 and @wwkenwong

Introduction

Maglev is a new optimizing compiler introduced in Chrome M117. For details of its background, its performance compared with Sparkplug and Turbofan, and its Static Single Assignment-based approach, you can refer to this blog published in early December this year. Here, we focus on the root cause analysis of a vulnerability discovered by our fuzzer.

Fuzzer

Using Fuzzilli, developed and maintained by Samuel Gross and Carl Smith of Google's V8 Security team, we started our journey of fuzzing Chrome V8 for vulnerabilities. We modified the fuzzer by tuning the probabilities of generating different function calls, studied published vulnerabilities and the source code, and created fuzzing templates.

Discovery

We checked out the V8 revision with hash a8275630c081e7842e05ae7fecc4d3e85888a4e6 and identified the following Proof of Concept (PoC) and crash information. The debug-check failure message shows label->predecessor_count_ > 1, with the fatal error raised at ../../src/maglev/maglev-graph-builder.cc line 515.

function f0(a1, a2, a3, a4) {
  return a3;
}
const v6 = Array();
function f7(a8) {
  a8[1073741824] = a8;
  return f7;
}
const t8 = f7(v6);
t8(f0);
%OptimizeMaglevOnNextCall(f7);
f7(v6);

// CRASH INFO
// ==========
// TERMSIG: 6
// STDERR:
// #
// # Fatal error in ../../src/maglev/maglev-graph-builder.cc, line 515
// # Debug check failed: label->predecessor_count_ > 1 (1 vs. 1).
// #
// #
// #
// #FailureMessage Object: 0x7fff334abbd0
// ==== C stack trace ===============================
//
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x8958f2) [0x55cc4897e8f2]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x894137) [0x55cc4897d137]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x88656f) [0x55cc4896f56f]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x885f25) [0x55cc4896ef25]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x2120067) [0x55cc4a209067]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x21dcceb) [0x55cc4a2c5ceb]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x214cc9c) [0x55cc4a235c9c]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x1f39ae0) [0x55cc4a022ae0]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x1f355df) [0x55cc4a01e5df]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x1f315f0) [0x55cc4a01a5f0]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x1f2fc9f) [0x55cc4a018c9f]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x1f28579) [0x55cc4a011579]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0xb6eb22) [0x55cc48c57b22]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0xb9bbe1) [0x55cc48c84be1]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0xb823c2) [0x55cc48c6b3c2]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0xb84bf1) [0x55cc48c6dbf1]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x1d7e9e3) [0x55cc49e679e3]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x1d7e21c) [0x55cc49e6721c]
// /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8(+0x3f36176) [0x55cc4c01f176]
// Received signal 6
// STDOUT:
// [COV] edge counters initialized. Shared memory: shm_id_103962_2 with 1404120 edges
// FUZZER ARGS: ../.build/x86_64-unknown-linux-gnu/release/FuzzilliCli --profile=v8 /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8 --storagePath=/home/fuzz/v8_crash_maglev_master_20231216 --engine=hybrid --jobs=5 --resume
// TARGET ARGS: /home/fuzz/Fuzzing_v8/v8/out/fuzzbuild/d8 --expose-gc --omit-quit --allow-natives-syntax --fuzzing --jit-fuzzing --future --harmony --no-turbofan --no-turboshaft
// CONTRIBUTORS: ExplorationMutator, FixupMutator
// EXECUTION TIME: 26ms

It is difficult to tell from the crash message alone what happened and what the root cause of the vulnerability is, so we moved on to debugging with GDB.

MaglevGraphBuilder

Before that, some background on MaglevGraphBuilder. The MaglevGraphBuilder is responsible for:

1. Translating bytecode into a graph of nodes that represents the program’s operations.

2. Performing optimizations on this graph to improve the runtime performance of the code.

3. Potentially handling things like control flow, variable lifetimes, and state tracking across different paths of the code.

In summary, maglev-graph-builder.h contains the declarations and definitions for classes and utilities that are part of V8’s Maglev JIT compiler. These components help in building and optimizing an internal graph representation of JavaScript code for fast execution.
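To make the notion of "predecessors" concrete before diving into the crash, consider a hypothetical JavaScript function of our own (not taken from the PoC): after an if/else, control can reach the join point from either branch, so that block of the graph has two predecessors. The failing DCHECK label->predecessor_count_ > 1 asserts exactly this kind of expectation about merge points.

```javascript
// Hypothetical illustration (not from the PoC): the code after the
// if/else is a merge point in the compiler's graph. Control can reach
// it from the "then" branch or the "else" branch, so the merge point
// has two predecessors.
function absLike(x) {
  let r;
  if (x < 0) {   // conditional jump: splits control flow
    r = -x;      // predecessor #1 of the merge point
  } else {
    r = x;       // predecessor #2 of the merge point
  }
  return r;      // merge point: predecessor count is 2
}
```

When Maglev builds its graph for a function shaped like this, labels at merge points are expected to have more than one predecessor; the crash indicates that this invariant was violated during graph building.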

As shown in the debug output below, frames #5 and #6 call the ReducePredecessorCount function, which then leads to the fatal error seen at frame #3.

Debug details
[#5] 0x555557674067 → v8::internal::maglev::MaglevGraphBuilder::MaglevSubGraphBuilder::ReducePredecessorCount(v8::internal::maglev::MaglevGraphBuilder::MaglevSubGraphBuilder::Label*)()
[#6] 0x5555577312b7 → v8::internal::maglev::ReduceResult v8::internal::maglev::MaglevGraphBuilder::TryBuildPolymorphicElementAccess<v8::internal::maglev::MaglevGraphBuilder::VisitSetKeyedProperty()::$_0&>(v8::internal::maglev::ValueNode*, v8::internal::maglev::ValueNode*, v8::internal::compiler::KeyedAccessMode const&, v8::internal::ZoneVector<v8::internal::compiler::ElementAccessInfo> const&, v8::internal::maglev::MaglevGraphBuilder::VisitSetKeyedProperty()::$_0&)()

What is the predecessor count? How should it be calculated?

From maglev-graph-builder.h, here is the CalculatePredecessorCounts function:

void CalculatePredecessorCounts() {
  // Add 1 after the end of the bytecode so we can always write to the offset
  // after the last bytecode.
  size_t array_length = bytecode().length() + 1;
  predecessors_ = zone()->AllocateArray<uint32_t>(array_length);
  MemsetUint32(predecessors_, 0, entrypoint_);
  MemsetUint32(predecessors_ + entrypoint_, 1, array_length - entrypoint_);

  // We count jumps from peeled loops to outside of the loop twice.
  bool is_loop_peeling_iteration = false;
  base::Optional<int> peeled_loop_end;
  interpreter::BytecodeArrayIterator iterator(bytecode().object());
  for (iterator.SetOffset(entrypoint_); !iterator.done();
       iterator.Advance()) {
    interpreter::Bytecode bytecode = iterator.current_bytecode();
    if (allow_loop_peeling_ &&
        bytecode_analysis().IsLoopHeader(iterator.current_offset())) {
      const compiler::LoopInfo& loop_info =
          bytecode_analysis().GetLoopInfoFor(iterator.current_offset());
      // Generators use irreducible control flow, which makes loop peeling too
      // complicated.
      if (loop_info.innermost() && !loop_info.resumable() &&
          (loop_info.loop_end() - loop_info.loop_start()) <
              v8_flags.maglev_loop_peeling_max_size &&
          (!v8_flags.maglev_loop_peeling_only_trivial ||
           loop_info.trivial())) {
        DCHECK(!is_loop_peeling_iteration);
        is_loop_peeling_iteration = true;
        loop_headers_to_peel_.Add(iterator.current_offset());
        peeled_loop_end = bytecode_analysis().GetLoopEndOffsetForInnermost(
            iterator.current_offset());
      }
    }
    if (interpreter::Bytecodes::IsJump(bytecode)) {
      if (is_loop_peeling_iteration &&
          bytecode == interpreter::Bytecode::kJumpLoop) {
        DCHECK_EQ(iterator.next_offset(), peeled_loop_end);
        is_loop_peeling_iteration = false;
        peeled_loop_end = {};
      }
      if (iterator.GetJumpTargetOffset() < entrypoint_) {
        static_assert(kLoopsMustBeEnteredThroughHeader);
        if (predecessors_[iterator.GetJumpTargetOffset()] == 1) {
          // We encountered a JumpLoop whose loop header is not reachable
          // otherwise. This loop is either dead or the JumpLoop will bail
          // with DeoptimizeReason::kOSREarlyExit.
          predecessors_[iterator.GetJumpTargetOffset()] = 0;
        }
      } else {
        predecessors_[iterator.GetJumpTargetOffset()]++;
      }
      if (is_loop_peeling_iteration &&
          iterator.GetJumpTargetOffset() >= *peeled_loop_end) {
        // Jumps from within the peeled loop to outside need to be counted
        // twice, once for the peeled and once for the regular loop body.
        predecessors_[iterator.GetJumpTargetOffset()]++;
      }
      if (!interpreter::Bytecodes::IsConditionalJump(bytecode)) {
        predecessors_[iterator.next_offset()]--;
      }
    } else if (interpreter::Bytecodes::IsSwitch(bytecode)) {
      for (auto offset : iterator.GetJumpTableTargetOffsets()) {
        predecessors_[offset.target_offset]++;
      }
    } else if (interpreter::Bytecodes::Returns(bytecode) ||
               interpreter::Bytecodes::UnconditionallyThrows(bytecode)) {
      predecessors_[iterator.next_offset()]--;
      // Collect inline return jumps in the slot after the last bytecode.
      if (is_inline() && interpreter::Bytecodes::Returns(bytecode)) {
        predecessors_[array_length - 1]++;
        if (is_loop_peeling_iteration) {
          predecessors_[array_length - 1]++;
        }
      }
    }
    // TODO(leszeks): Also consider handler entries (the bytecode analysis
    // will do this automatically I guess if we merge this into that.
  }
  if (!is_inline()) {
    DCHECK_EQ(0, predecessors_[bytecode().length()]);
  }
}

The CalculatePredecessorCounts function is responsible for analyzing a JavaScript bytecode array and calculating how many times each bytecode is targeted by a control flow operation such as a jump, switch, or return. This information is typically used for control flow analysis, which is an essential part of optimizing compilers like V8’s Maglev.

Here’s a step-by-step breakdown of the CalculatePredecessorCounts function:

Initialization: An array predecessors_ is allocated and initialized to store the count of predecessors for each bytecode. The array is one element longer than the bytecode array so that the slot after the last bytecode can always be written to.

Loop Peeling Detection: Loop peeling is an optimization technique where the first iteration of the loop is executed separately from the rest of the iterations. This can make the loop body more streamlined and optimize away conditional checks for the first iteration. The function checks for loop headers and decides whether to peel the loop based on several conditions (e.g., size of the loop, triviality, resumability).

Iterating Over Bytecodes: The function iterates over the bytecodes starting from entrypoint_ using an interpreter::BytecodeArrayIterator.

Jump Handling: If a jump bytecode is encountered, the function updates the predecessors_ array based on the jump target. Special handling is applied for loops and peeled loops. If the jump is within a loop being peeled, the target’s count is incremented twice to account for the peeled iteration.

Switch Handling: For switch bytecodes, all target offsets in the jump table have their predecessor counts incremented.

Return and Throw Handling: Bytecodes that represent returns or unconditional throws decrement the predecessor count for the next bytecode. If the function is inlined (is_inline()), it also increments the count for a placeholder after the last bytecode.

Inline Return Jumps: For inlined functions, the return jumps are collected in the slot after the last bytecode. If loop peeling is in progress, the count is incremented twice.

Verification: If the function is not inlined, it checks that the slot after the last bytecode has a predecessor count of zero, as expected.

TODO: There’s a TODO comment about considering handler entries, indicating that there may be future work to integrate this with broader bytecode analysis.

This function is part of the Maglev JIT compiler’s optimization process in the V8 JavaScript engine. It’s used in the phases of the compiler that deal with control flow analysis to optimize how the code is executed. Knowing the number of predecessors for each bytecode can help the compiler make decisions about inlining, dead code elimination, loop unrolling, and other optimizations.

The CalculatePredecessorCounts function is an example of the kind of static analysis that compilers perform to understand the structure of the code and optimize it before generating machine code. The predecessor count is crucial for constructing a Control Flow Graph (CFG), which is a representation of all paths that might be traversed through a program during its execution. The CFG is then used by the Maglev JIT compiler to generate optimized machine code that executes efficiently on the CPU.
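The counting logic above can be sketched in miniature. The following is a simplified, hypothetical model of our own (not V8's code, and it omits loop peeling, switches, and inlining): every instruction starts with one fall-through predecessor, jumps add a predecessor at their target, and unconditional jumps and returns remove the fall-through edge.

```javascript
// Simplified, hypothetical sketch of the predecessor-counting idea
// (our own model for illustration; NOT V8's actual code).
// Each "bytecode" is an object { op, target? }; array indices stand in
// for bytecode offsets, and the entrypoint is offset 0.
function calculatePredecessorCounts(bytecodes) {
  // One extra slot after the last bytecode, mirroring
  // `array_length = bytecode().length() + 1` in V8.
  const predecessors = new Array(bytecodes.length + 1).fill(1);
  bytecodes.forEach((bc, i) => {
    if (bc.op === 'Jump' || bc.op === 'JumpIfTrue') {
      predecessors[bc.target]++;   // jump target gains a predecessor
      if (bc.op === 'Jump') {
        predecessors[i + 1]--;     // unconditional jump: no fall-through
      }
    } else if (bc.op === 'Return') {
      predecessors[i + 1]--;       // nothing falls through a return
    }
  });
  return predecessors;
}

// A diamond shape: offset 0 branches to 3, offsets 1-2 are the
// fall-through arm, and both arms meet at offset 4.
const counts = calculatePredecessorCounts([
  { op: 'JumpIfTrue', target: 3 },  // 0
  { op: 'Add' },                    // 1
  { op: 'Jump', target: 4 },        // 2
  { op: 'Sub' },                    // 3
  { op: 'Return' },                 // 4
]);
// counts is [1, 1, 1, 1, 2, 0]: offset 4 is a merge point with two
// predecessors, which is the condition the failing DCHECK expects.
```

The toy model captures the core invariant: any offset reachable from more than one place ends up with a count greater than one, and the slot after a non-inlined function's last bytecode ends up at zero.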

ReduceResult Class

The ReduceResult class in maglev-graph-builder.h deals with the result of some form of reduction process, a common operation in compiler optimization where expressions are simplified or transformed.

Here’s an explanation of the class and its members:

ReduceResult uses an internal enum Kind to represent the different states of the reduction result. These states include:

  • kDoneWithValue: The reduction is complete and has produced a value.
  • kDoneWithAbort: The reduction is complete, but it led to an abort condition.
  • kDoneWithoutValue: The reduction is complete but didn’t produce a value.
  • kFail: The reduction failed.
  • kNone: No reduction result is available, which is the default state.

The class has a private member payload_ of type base::PointerWithPayload. This type is a template that combines a pointer to a ValueNode with a Kind payload in a space-efficient manner. The Kind payload uses 3 bits.
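A quick illustration of the PointerWithPayload idea (our own sketch, not V8's implementation): because suitably aligned pointers have zero low bits, a small tag such as the 3-bit Kind can be stored in those bits and masked back out.

```javascript
// Hypothetical sketch of the PointerWithPayload packing trick (not V8
// code). If "pointers" are aligned to 8 bytes, their low 3 bits are
// always zero, so a 3-bit payload can live in those bits.
const kPayloadBits = 3;
const kPayloadMask = (1 << kPayloadBits) - 1; // 0b111

function pack(alignedAddress, kind) {
  // alignedAddress must be a multiple of 8 for this to be lossless.
  return alignedAddress | (kind & kPayloadMask);
}
function unpackAddress(word) { return word & ~kPayloadMask; }
function unpackKind(word) { return word & kPayloadMask; }
```

This is why the class can hold both a ValueNode* and a Kind in a single word with no extra storage.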

The public interface of ReduceResult provides several constructors and methods for creating instances of the class and querying their state:

  • The default constructor sets the state to kNone.
  • The constructor that takes a ValueNode* sets the state to kDoneWithValue and ensures the given ValueNode* is not null.
  • The value() method asserts that the result has a value (kDoneWithValue) and returns the associated ValueNode*.
  • The HasValue(), IsNone(), IsDone(), IsFail(), IsDoneWithValue(), IsDoneWithoutValue(), and IsDoneWithAbort() methods check for specific states.
  • The Done(…), DoneWithAbort(), and Fail() methods are static factory functions that create ReduceResult instances with the corresponding states.

There are also default copy constructor and assignment operator overloads to ensure ReduceResult can be copied and assigned as expected.

This class is designed to be used within the V8 JavaScript engine’s compilation process during the optimization phase. It encapsulates the result of an operation and provides an efficient way to check its outcome and access any resulting value. The use of base::PointerWithPayload suggests an emphasis on memory efficiency, which is important in performance-critical code like a JavaScript engine.
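As a rough analogue, the state-plus-optional-value shape of ReduceResult can be modeled in JavaScript. This is our own hypothetical illustration, not V8's implementation; the real class packs the Kind into the pointer's low bits via base::PointerWithPayload, while this model keeps two separate fields for clarity.

```javascript
// Hypothetical JavaScript model of ReduceResult's states (not V8 code).
const Kind = Object.freeze({
  kDoneWithValue: 0,
  kDoneWithAbort: 1,
  kDoneWithoutValue: 2,
  kFail: 3,
  kNone: 4,
});

class ReduceResult {
  constructor(kind = Kind.kNone, value = null) {
    if (kind === Kind.kDoneWithValue && value === null) {
      throw new Error('kDoneWithValue requires a non-null value');
    }
    this.kind = kind;
    this.payload = value; // stands in for the ValueNode* pointer
  }
  // Static factory functions mirroring Done(...), DoneWithAbort(), Fail().
  static Done(value) { return new ReduceResult(Kind.kDoneWithValue, value); }
  static DoneWithAbort() { return new ReduceResult(Kind.kDoneWithAbort); }
  static Fail() { return new ReduceResult(Kind.kFail); }
  value() {
    // Like the C++ value(): only valid in the kDoneWithValue state.
    if (this.kind !== Kind.kDoneWithValue) throw new Error('no value');
    return this.payload;
  }
  HasValue() { return this.kind === Kind.kDoneWithValue; }
  IsFail() { return this.kind === Kind.kFail; }
  IsNone() { return this.kind === Kind.kNone; }
}
```

A caller can then branch on the outcome, e.g. `const r = ReduceResult.Done(node); if (r.HasValue()) use(r.value());`, which mirrors how the graph builder consumes reduction results.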

Root Cause Analysis

Linking up the background of the functions involved and referring back to the PoC JavaScript program, it emerges that the compiler performs an SMI (Small Integer) check on the key, and this check fails because the index used on the array a8 is 1073741824, which is 2^30, while the upper bound of the SMI range is 2^30 - 1. However, the Maglev compiler simply calls ReducePredecessorCount and saves the reduced result for further compilation, which causes the fatal error.
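The significance of the index can be checked directly in JavaScript. In V8 builds with pointer compression, SMIs are 31-bit signed integers, so the ceiling is 2^30 - 1, and 1073741824 is the first integer past it:

```javascript
// Why the PoC's index matters: 1073741824 is exactly 2 ** 30, one past
// the largest SMI in a 31-bit SMI configuration (pointer compression).
const index = 1073741824;
const smiMax = 2 ** 30 - 1; // 1073741823

// Same shape as `a8[1073741824] = a8` in the PoC: the store succeeds at
// the JavaScript level (array indices may go up to 2 ** 32 - 2), but the
// key no longer fits in an SMI, which is consistent with the failing SMI
// check described above. The array also becomes hugely sparse.
const a = [];
a[index] = a;
// a.length is now 1073741825 (index + 1), with a single element stored.
```

Note that without pointer compression SMIs are 32-bit, so the exact ceiling depends on the build configuration; the fuzzed d8 binary here behaves as described in the analysis above.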

Afterwards, studying the patch for this bug or vulnerability, we found that the fixed code no longer reduces the predecessor count, and it saves the result to a ValueNode instead of a ReduceResult to avoid further compilation.

Patch of the bug/vulnerability

The bug was submitted and proven to be a duplicate, and it will be opened to the public in 120 days. Unfortunately, we cannot yet determine whether there is any security impact.

Credit: We are very thankful to Carl Smith from Google V8 Security for the deep collaboration and sharing. Have a good time at CCC.


VXRL

The VXRL Team is founded by a group of enthusiastic security researchers, providing information security services and contributing to the community. https://www.vxrl.hk