Hacker News

It's not me, it's the compiler

Comments

It's not me, it's the compiler! Every programmer has, at least once, thought to themselves that "it's not me, it's the compiler!". Usually, we're wrong, but this is the story of the time I was actually right.

On a Saturday evening, as one usually does, I was refactoring my JavaScript engine's parser. Earlier I had written this code in the project:

impl LexerConsumer {
    #[inline]
    pub fn consume(&mut self) {
        self.0 += 1;
    }

    #[inline]
    pub fn consume_test(&mut self, store: &LexStore, expected: TokenKind) -> bool {
        if self.peek(store) == expected {
            self.consume();
            true
        } else {
            false
        }
    }
}

Which is on its own a fairly common pattern in a parser, but I didn't love the shape of this code. Each branch is returning the value of the comparison, so what if I just returned the value myself? I got to work and wrote this version instead, and I was happy with the aesthetics of the generated asm in isolation - it was a few bytes shorter and didn't have the branch.

#[inline]
pub fn consume_test(&mut self, store: &LexStore, expected: TokenKind) -> bool {
    let x = self.peek(store) == expected;
    self.0 += x as u32;
    x
}
mov eax,DWORD PTR [rdi]
mov ecx,eax
and ecx,0x3f
movzx ecx,BYTE PTR [rsi+rcx*1]
cmp cl,dl
jne 233263
inc eax
mov DWORD PTR [rdi],eax
cmp cl,dl
sete al
ret
|
mov ecx,DWORD PTR [rdi]
mov r8d,ecx
and r8d,0x3f
xor eax,eax
cmp BYTE PTR [rsi+r8*1],dl
sete al
add ecx,eax
mov DWORD PTR [rdi],ecx
ret

So I went ahead to try and parse a simple statement, and right at the top of my bash history was a command to parse a for-loop.

> joe parse - 'for (var lol; false; false) {}'
Error was found
Diagnostic { kind: E079, flag: Flag(11529215046068469773), byte: 5, current_token: Var }
Parse error

Wait! What just happened?! Did I somehow mess up the function? Maybe x as u32 has different semantics than I remember... let me just write it more explicitly.

#[inline]
pub fn consume_test(&mut self, store: &LexStore, expected: TokenKind) -> bool {
    let x = self.peek(store) == expected;
    self.0 += if x { 1 } else { 0 };
    x
}

And somehow this version was working!

> joe parse - 'for (var lol; false; false) {}'
Parsed 8 nodes in 14ns
Raw nodes:
[0] POS=0 ScriptStart payload=0
[1] POS=9 VariableDeclaration payload=0
[2] POS=14 FalseLit payload=0
[3] POS=21 FalseLit payload=0
[4] POS=28 BlockStatementStart payload=0
[5] POS=29 BlockStatement payload=0
[6] POS=0 ForStatement payload=0
[7] POS=0 Script payload=0
POS=000 [7] Script
POS=000 [6] ForStatement
POS=009 [1] VariableDeclaration @A0
POS=014 [2] FalseLit
POS=021 [3] FalseLit
POS=029 [5] BlockStatement

For a second I doubted my sanity, but I was fairly confident that I know what bool as u32 does - I've written that exact cast hundreds of times. So at that moment I said what any desperate and mildly confused programmer would: It's not me, it's the compiler!

I had to look at what my for statement parser is really doing. Lucky for me, with my in-house custom build system, looking at the assembly of any function is a command away. And no, it's not a cargo asm wrapper - my project doesn't use cargo, since I've made my own build system in TypeScript and it runs on Deno, which actually supports --dry mode as well, so you can see what it runs under the hood. Yayyy transparency!

> x -b --fn ForOrInOfStatement 1 --dry
MKDIR ./out
RUN rustc src/main.rs --crate-name=joe --crate-type=staticlib --edition=2024 --out-dir=./out --target=x86_64-unknown-linux-gnu --cfg joe_no_libc --emit=link,obj -Crelocation-model=static -Copt-level=3 -Clto -Ccodegen-units=1 -Cdebuginfo=line-tables-only --diagnostic-width=150 -Cpanic=abort -Ztemps-dir=out/tmp -Zhuman-readable-cgu-names --extern proc=./out/libproc.so
RUN mold -melf_x86_64 -o ./out/joe ./out/joe.o --static --package-metadata="Joe!" -zrelro -znoexecstack --discard-locals --build-id --gc-sections --no-undefined --icf=safe --compress-debug-sections=zlib
RUN bash -c nm out/joe | rustfilt | grep ForOrInOfStatement | grep -oP '^[0-9a-f]+ t\s+\K.+' | sed -n '1p' | xargs -I {} objdump -WK -M intel -d --disassembler-color=on --visualize-jumps=color --demangle=rust ./out/joe --disassemble={} 2>/dev/null | grep -v 'Disassembly of section' | grep -v './out/joe:' | less -R
# line breaks added for readability, btw. you're welcome.

Anyway... back to inspecting the assembly, but maybe you should see the Rust version first. Here is a trimmed down version of the code - the real function is a few hundred lines long :D

fn ForOrInOfStatement(
    store: &mut Storage,
    mut lex: LexerConsumer,
    mut emit: EmitBuffer,
    mut stack: StateStack,
) -> Termination {
    debug_assert_eq!(lex.peek(store), TokenKind::For);
    lex.consume(); // `for`

    if lex.peek_test(store, TokenKind::Await/*=0x6e*/) {
        if !stack.has_flag(Flag::AWAIT) {
            return raise_diagnostic(store, lex, emit, stack, DiagnosticKind::E056);
        }
        if !lex.consume_test(store, TokenKind::LParen) {
            return raise_diagnostic(store, lex, emit, stack, DiagnosticKind::E057);
        }
        wip!()
    }

    if !lex.consume_test(store, TokenKind::LParen/*=0x04*/) {
        return raise_diagnostic(store, lex, emit, stack, DiagnosticKind::E058);
    }

    match lex.peek(store) {
        TokenKind::Var => { /* */ }
        // ...
        _ => {
            stack.use_flags(store, FlagDiff::clear(Flag::IN)); // [~In]
            stack.push(store, State::ForOrInOfStatement_decide);
            LeftHandSideExpression(store, lex, emit, stack)
        },
    }
}

fn PrimaryExpression(
    store: &mut Storage,
    mut lex: LexerConsumer,
    mut emit: EmitBuffer,
    mut stack: StateStack,
) -> Termination {
    match lex.peek(store) {
        // ...
        _ => raise_diagnostic(store, lex, emit, stack, DiagnosticKind::E079),
    }
}

And now the reveal:

> x -b --fn ForOrInOfStatement 1
000000000021e290 <ForOrInOfStatement>:
21e290: ff c6                   inc    esi
21e292: 89 f0                   mov    eax,esi
21e294: 83 e0 3f                and    eax,0x3f
21e297: 0f b6 04 07             movzx  eax,BYTE PTR [rdi+rax*1]
21e29b: 83 f8 04                cmp    eax,0x4           # cmp TokenKind::LParen
21e29e: ,----- 75 70            jne    21e310
21e2a0: |       4d 85 c0       test   r8,r8
21e2a3: |     ,-- 78 14         js     21e2b9
21e2a5: |     |   41 0f b6 c0   movzx  eax,r8b
21e2a9: |     |   c6 84 07 10 56 00 00  mov    BYTE PTR [rdi+rax*1+0x5610],0x20
21e2b0: |     |   20
21e2b1: |     |   49 ff c0       inc    r8
21e2b4: |     |   e9 d7 45 00 00 jmp    222890
21e2b9: |     '-> 48 b8 00 ff ff ff ff  movabs rax,0x7fffffffffffff00
21e2c0: |        ff ff 7f
21e2c3: |        4c 21 c0       and    rax,r8
21e2c6: |        45 0f b6 c8    movzx  r9d,r8b
21e2ca: |        42 c6 84 0f 10 56 00  mov    BYTE PTR [rdi+r9*1+0x5610],0x0
21e2d1: |        00 00
21e2d3: |        45 89 c9       mov    r9d,r9d
21e2d6: |        4d 89 c2       mov    r10,r8
21e2d9: |        49 c1 ea 10    shr    r10,0x10
21e2dd: |        49 bb 00 00 00 00 ff  movabs r11,0xffff00000000
21e2e4: |        ff 00 00
21e2e7: |        4d 21 d3       and    r11,r10
21e2ea: |        4e 89 9c cf d0 56 00  mov    QWORD PTR [rdi+r9*8+0x56d0],r11
21e2f1: |        00
21e2f2: |        41 ff c0       inc    r8d
21e2f5: |        45 0f b6 c0    movzx  r8d,r8b
21e2f9: |        49 09 c0       or     r8,rax
21e2fc: |        41 0f b6 c0    movzx  eax,r8b
21e300: |        c6 84 07 10 56 00 00  mov    BYTE PTR [rdi+rax*1+0x5610],0x20
21e307: |        20
21e308: |        49 ff c0       inc    r8
21e30b: |        e9 80 45 00 00 jmp    222890
21e310: '---->   48 89 ca       mov    rdx,rcx
21e313:         83 f8 6e       cmp    eax,0x6e           # cmp TokenKind::Await
21e316:       ,-- 75 15        jne    21e32d
21e318:       |   4c 89 c1     mov    rcx,r8
21e31b:       |   49 0f ba e0 3d bt     r8,0x3d
21e320:       ,--|-- 72 19     jb     21e33b
21e322:       |  |   41 b8 37 00 00 00  mov    r8d,0x37
21e328:       |  |   e9 b3 ae 00 00 jmp    2291e0
21e32d:       |  '-> 4c 89 c1  mov    rcx,r8
21e330:       |      41 b8 39 00 00 00  mov    r8d,0x39
21e336:       |      e9 a5 ae 00 00 jmp    2291e0
21e33b:       '----> 41 b8 38 00 00 00  mov    r8d,0x38
21e341:              e9 9a ae 00 00 jmp    2291e0

Oh! That's it?! Where did my match statement go?! I already knew what my ForOrInOfStatement used to look like prior to the change - it was a monster with a few register-to-stack spills that I was trying to get rid of earlier that week.

But looking more closely at the asm, we can see the compare against TokenKind::LParen which is almost certainly coming from our consume_test(store, TokenKind::LParen). But first, a little note on the parser's ABI, and in particular SysV ABI, which is the default behaviour in Rust for now.

pub struct LexerConsumer(u32 /* pos */, u32 /* buffer_len */);
pub struct EmitBuffer(usize);
pub struct StateStack(u64);

fn Foo(
    store: &mut Storage, // rdi
    mut lex: LexerConsumer, // rsi, rdx
    mut emit: EmitBuffer, // rcx
    mut stack: StateStack, // r8
) -> Termination {...}

So the assembly is now pretty straight forward to read:

# lex.consume() // `for`
21e290: ff c6                   inc    esi

# the lexer's ring buffer indexing math:
21e292: 89 f0                   mov    eax,esi
21e294: 83 e0 3f                and    eax,0x3f

# load the token at the current position
21e297: 0f b6 04 07             movzx  eax,BYTE PTR [rdi+rax*1]

# compare it with a TokenKind::LParen
21e29b: 83 f8 04                cmp    eax,0x4

# If not a LParen jump to 21e310
21e29e: 75 70                   jne    21e310

# if we're here the token was a LParen, and we're in the `consume_test`.
# but... didn't consume_test have a `self.0 += x as u32`?
# WAIT.... WHO ATE MY `INC ESI`?!

So yeah, that was pretty much the confirmation that the compiler is wrong. The rest was writing a minimum reproducible example. Since I was on 1.94.0-nightly, I suspected that perhaps this is fixed on nightly. I checked and it wasn't! Which meant I had to try and see if I can find the culprit.

My first attempt was looking at the LLVM IR before any optimization, so I added -Cno-prepopulate-passes to my build flags and saw a load from uninitialized memory. Then to see if it was a Rust/MIR issue vs an LLVM issue, I also used -Zmir-opt-level=0 and the issue was fixed.

I opened an issue and I was a bit nervous about it at first - "this seems like such a common pattern, what if I'm just being dumb and embarrass myself in public?" And that's why the title of the issue says "Suspected miscompilation" - that word was what I tried to hide behind in case I was being a silly cat (meow). By the time I opened the issue it was already late and I had to sleep.

Next day, I woke up and there was already a PR open with the fix from Hanna Kruppe, which was the best turnaround. The issue was labeled p-critical and i-miscompile. Out of +61K Rust issues, there only 7 (including this one) that are both p-critical and i-miscompile. To me, those are the most dangerous kind of bugs a compiler can have, given that they violate the contract between the programmer and the language. They show that not every safe code you write is safe. And also more generally, there are only 247 p-critical issues to begin with.

The best part was seeing the community responding to the issue so fast and having an actual fix landed within 18 hours. That was the most impressive part of this story, and kudos to tmiasko and hanna-kruppe for the quick response time.

Now I get to live and tell the story of the time I fought rustc and won :D

Comments

No comments yet. Start the discussion.