Debugging xmame Segfaults on X86-64


Background: I don't do anything crazy. No overclocking. No exotic CFLAGS. These games really do work on 32-bit builds but fail on 64-bit (X86-64), and the failures are consistent across many versions of gcc, CFLAGS, glibc, and xorg/XFree86. Thus I believe the MAME code itself is the culprit.

Since none of the mamedevs have X86-64 machines at this time, it falls upon xmame users to debug these for now.

Methodology:
1. I compile a debug build of xmame with this makefile, which disables optimization (-O0), sets MY_CPU=amd64, passes -g for gdb debugging, and does not strip the binary.
2. I use basic gdb commands to determine which line causes the segfault:

  • gdb xmamed.x11
  • run -rompath [path] game
  • bt
3. Use printfs to find the culprit.

    src/vidhrdw/seibuspi.c

    Games Affected:

    rdft2 and all others using this driver

    Description:
    rdft2 segfaults immediately. The crash appears to be caused by memcpy being fed a wrong address, computed from an out-of-bounds index into spimainram.
    I propose fixing the out-of-bounds array index.

    video_dma_address is 0 0 so spimainram index is fffffe00 -512
    video_dma_address is 0 0 so spimainram index is fffffe00 -512
    video_dma_address is 0 0 so spimainram index is fffffe00 -512
    starting here... video_dma_address is now 229376...
    starting here... video_dma_address is now 245760...
    starting here... video_dma_address is now 225280...
    video_dma_address is 37000 225280 so spimainram index is da00 55808
    
    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 16384 (LWP 2320)]
    0x0000002a9641b7f4 in memcpy () from /lib/libc.so.6
    (gdb) bt
    #0  0x0000002a9641b7f4 in memcpy () from /lib/libc.so.6
    #1  0x0000000001b109e3 in sprite_dma_start_w (offset=0, data=2400, mem_mask=4294901760) at src/vidhrdw/seibuspi.c:208
    #2  0x0000000000470aef in program_write_word_32le (address=0, data=2400) at src/memory.c:2461
    #3  0x0000000000527ba1 in WRITE16 (ea=1292, value=2400) at i386.h:324
    #4  0x000000000052f825 in i386_mov_rm16_r16 () at i386op16.c:1076
    #5  0x000000000053aab6 in i386_decode_opcode () at src/cpu/i386/i386.c:313
    #6  0x000000000052c9e8 in i386_operand_size () at i386ops.c:2085
    #7  0x000000000053aaa2 in i386_decode_opcode () at src/cpu/i386/i386.c:311
    #8  0x000000000053b619 in i386_execute (num_cycles=2314) at src/cpu/i386/i386.c:539
    #9  0x0000000000465c3c in cpunum_execute (cpunum=0, cycles=2314) at src/cpuintrf.c:1399
    #10 0x000000000046742f in cpu_timeslice () at src/cpuexec.c:836
    #11 0x0000000000466c4f in cpu_run () at src/cpuexec.c:415
    #12 0x000000000040574d in run_machine_core () at src/mame.c:572
    #13 0x00000000004055eb in run_machine () at src/mame.c:509
    #14 0x0000000000405245 in run_game (game=4610) at src/mame.c:330
    #15 0x0000000002092c2a in main (argc=4, argv=0x7fbffff7b8) at src/unix/main.c:91
    

    PROPOSED FIXES: The patch is available here.


    src/vidhrdw/wecleman.c

    Games Affected:

    hotchase
    wecleman
    

    Description:
    wecleman segfaults shortly into the attract mode during demo play.

    The problem:
    This is caused by pointer/integer casting.

    Compiling src/vidhrdw/wecleman.c ...
    src/vidhrdw/wecleman.c: In function `do_blit_zoom16':
    src/vidhrdw/wecleman.c:321: warning: cast from pointer to integer of different size
    src/vidhrdw/wecleman.c:328: warning: cast to pointer from integer of different size
    src/vidhrdw/wecleman.c:352: warning: cast from pointer to integer of different size
    src/vidhrdw/wecleman.c:359: warning: cast to pointer from integer of different size
    src/vidhrdw/wecleman.c:389: warning: cast from pointer to integer of different size
    src/vidhrdw/wecleman.c:396: warning: cast to pointer from integer of different size
    

    FIXES: The patch is available here,
    or an alternate version here if anyone prefers.


    src/sound/iremga20.c

    Games Affected:

    bmaster
    gunforce
    lethalth
    rtypeleo
    psoldier
    

    Description:
    bmaster segfaults shortly into the attract mode during demo play.
    rtypeleo, gunforce, lethalth, and psoldier all segfault before the title screen forms.

    The problem:

    Program received signal SIGSEGV, Segmentation fault.
    0x00000000007a822c in IremGA20_update (param=0, buffer=0x7fbffff480, length=57)
        at src/sound/iremga20.c:105
    105                     MIX_CH(2);
    (gdb) bt
    #0  0x00000000007a822c in IremGA20_update (param=0, buffer=0x7fbffff480,
        length=57) at src/sound/iremga20.c:105
    #1  0x000000000077aac4 in streams_sh_update () at src/sound/streams.c:140
    #2  0x000000000077a401 in sound_update () at src/sndintrf.c:1419
    #3  0x00000000004056c6 in updatescreen () at src/mame.c:1368
    #4  0x00000000004672db in cpu_vblankcallback (param=0) at src/cpuexec.c:1613
    #5  0x0000000000477305 in mame_timer_set_global_time (newbase=
          {seconds = 30, subseconds = 849999999999998766}) at src/timer.c:340
    #6  0x0000000000465e3f in cpu_timeslice () at src/cpuexec.c:805
    #7  0x000000000046544b in cpu_run () at src/cpuexec.c:399
    #8  0x000000000040421a in run_machine_core () at src/mame.c:565
    #9  0x00000000004040b8 in run_machine () at src/mame.c:502
    #10 0x0000000000403d58 in run_game (game=1126) at src/mame.c:329
    #11 0x0000000001d8d4eb in main (argc=6, argv=0x7fbffff758)
        at src/unix/main.c:95
    

    Also of note:
    Compiling src/sound/iremga20.c ...
    src/sound/iremga20.c: In function `IremGA20_update':
    src/sound/iremga20.c:92: warning: cast from pointer to integer of different size
    src/sound/iremga20.c:93: warning: cast from pointer to integer of different size
    src/sound/iremga20.c:94: warning: cast from pointer to integer of different size
    src/sound/iremga20.c:103: warning: cast to pointer from integer of different size
    src/sound/iremga20.c:104: warning: cast to pointer from integer of different size
    src/sound/iremga20.c:105: warning: cast to pointer from integer of different size
    src/sound/iremga20.c:106: warning: cast to pointer from integer of different size
    src/sound/iremga20.c:109: warning: cast to pointer from integer of different size
    src/sound/iremga20.c:110: warning: cast to pointer from integer of different size
    

    The problem:
    Casting pointers to int is Very Wrong(tm) on 64-bit systems: pointers are 64 bits wide while int is still 32 bits, so the upper half of the address is lost. These should be cast to unsigned long instead, which fixes both the warnings and the segfault, and should still work fine on 32-bit systems.

    REAL FIXES: The patch is available here.


    src/vidhrdw/itech32.c

    Games Affected:

    sftm
    timekill
    

    Description:
    sftm and timekill crash early in the attract mode; sftm crashes before the title screen logo forms.

    The problem:

    Program received signal SIGSEGV, Segmentation fault.
    0x0000000001177576 in draw_raw (base=0x2a976ca010, color=2816)
        at src/vidhrdw/itech32.c:544
    544                     dstbase[sx >> 8] = pixel | color;
    

    printfs reveal the following patterns prior to segfault:

    sx is 1fa00 129536 and shifted right 8 our index is 1fa 506
    sx is 1fb00 129792 and shifted right 8 our index is 1fb 507
    sx is 1fc00 130048 and shifted right 8 our index is 1fc 508
    sx is 1fd00 130304 and shifted right 8 our index is 1fd 509
    sx is 1fe00 130560 and shifted right 8 our index is 1fe 510
    sx is 1ff00 130816 and shifted right 8 our index is 1ff 511
    sx is 80000 524288 and shifted right 8 our index is 800 2048
    

    So sx gets very large.

    startx was 0
    startx was 0
    startx was 0
    startx was 458752 and at this time VIDEO_TRANSFER_X was 0x700 1792
    

    sx comes from startx and VIDEO_TRANSFER_X, which get large. So again it appears that an array index (sx >> 8 in this case) is overstepping its bounds. Until this point the index always falls between 0 and 511 inclusive.

    REAL FIXES:
    R. Belmont supplied a workaround that masks these array indices, similar to what compute_safe_address already does. This was sufficient to fix timekill, but not sftm.
    I also extended the masking to line 537, which fixes sftm as well:

    dstbase = &base[compute_safe_address(sx >> 8, sy >> 8) - (sx >> 8)];
    
    becomes
    dstbase = &base[compute_safe_address(sx >> 8, sy >> 8) - ((sx >> 8) & vram_xmask)];
    

    src/vidhrdw/psikyosh.c

    Games Affected:

    gunbird2
    soldivid
    

    Description:
    soldivid crashes immediately on X86-64. It crashes in src/vidhrdw/psikyosh.c, line 324, in psikyosh_drawbackground().

    bg_pri[i] = (psikyosh_bgram[(BG_TYPE(i)*0x800)/4 + 0x400/4 - 0x4000/4] & 0xff000000) >> 24;
    

    The problem:
    BG_TYPE(0) returns 0! The index therefore evaluates to psikyosh_bgram[-3840], which is clearly illegal. Somehow 32-bit X86 is just a lot more forgiving about this, but it is still incorrect regardless of platform.

    Is BG_TYPE returning the correct value? 0 seems strange.

    REAL FIXES:
    As per Hans de Goede's email:

    "src/vidhrdw/psikyosh.c: psikyosh_bgram got addressed (indexed) with a
    negative value. The cause of this is that the bg layer priotity got
    calculated before checking if the layer was enabled (which it was not in
    this case) and before checking if the bg layer type used to calc the
    priority was a valid type (which it was not in this case, which is ok
    because the layer was disabled, but an enabled layer can still have an
    invalid layer type which could cause a negative address again)
    
    "Both the layer being enabled and the layer type are checked now before
    calculating the priority. This stops soldivid from crashing on x86_64,
    besides that the functionality is 100% identical because the
    calculated priority never got used if the layer was not enabled or the
    type was not valid."
    

    src/sound/ym2151.c

    Games Affected:

    ctribe
    ddragon
    ddragon2
    galaga88
    gauntlet
    gunforce
    hachoo
    hydra
    kengo
    mercs
    msword
    scontra
    sf1
    stunrun
    term2
    toobin
    willow
    

    Description:
    Tested on several xmame versions; most recently xmame-0.87.1. 32-bit compiles work fine, but 64-bit compiles segfault.
    Example: stunrun segfaults in the attract mode just as the car hits the yellow plate.

    gdb pinpoints the "advance" function in ym2151.c.

    Program received signal SIGSEGV, Segmentation fault.
    0x00000000007dd845 in advance () at src/sound/ym2151.c:2186
    2186                                    (op+1)->phase += ( (PSG->freq[ kc_channel + (op+1)->dt2 ] + (op+1)->dt1) * (op+1)->mul ) >> 1;
    #0  0x00000000007dd845 in advance () at src/sound/ym2151.c:2186
    #1  0x00000000007dbe21 in YM2151UpdateOne (num=0, buffers=0x7fbffff4c0,
        length=8) at src/sound/ym2151.c:2459
    #2  0x000000000077b05b in stream_update (channel=0, min_interval=0)
        at src/sound/streams.c:272
    #3  0x000000000077d281 in YM2151UpdateRequest (chip=0)
        at src/sound/2151intf.c:64
    #4  0x000000000077d65e in YM2151_data_port_0_w (offset=0, data=221 '[]')
        at src/sound/2151intf.c:280
    #5  0x000000000046cdec in program_write_byte_8 (address=0, data=221 '[]')
        at src/memory.c:2238
    #6  0x0000000000599a7a in m6502_8d () at t6502.c:274
    #7  0x000000000059135b in m6502_execute (cycles=59)
        at src/cpu/m6502/m6502.c:272
    #8  0x00000000004642d2 in cpunum_execute (cpunum=3, cycles=59)
        at src/cpuintrf.c:1272
    #9  0x00000000004659e5 in cpu_timeslice () at src/cpuexec.c:752
    #10 0x000000000046544b in cpu_run () at src/cpuexec.c:399
    #11 0x000000000040421a in run_machine_core () at src/mame.c:565
    #12 0x00000000004040b8 in run_machine () at src/mame.c:502
    #13 0x0000000000403d58 in run_game (game=3530) at src/mame.c:329
    #14 0x0000000001d8d307 in main (argc=4, argv=0x7fbffff7a8)
        at src/unix/main.c:95
    

    Debugging with printf:
    Apply this patch to src/sound/ym2151.c, make, and run xmamed.x11 -rompath [path] stunrun

    ...
    PSG freq is 66c0
    steve_channel is -200, resetting kc_channel to zero
    op kc_i is 0
    mod_ind is ffffff38
    op+1 phase is ccb91c89
    kc_channel is 0
    op+1 dt2 is 0
    op+1 dt1 is ffffff5e
    op+1 mul is 1
    ...
    

    Observations:

  • mod_ind is an INT32 that is supposed to mimic a signed 8-bit value, yet the debug output shows values BELOW -128 (e.g. -200, printed as ffffff38).
  • kc_channel is supposed to be unsigned, but on X86-64/amd64 it effectively stays negative, which results in an out-of-bounds access later on when we evaluate PSG->freq[ kc_channel + (op+1)->dt2 ].
  • The debugging patch resets kc_channel to zero (is this proper?) and keeps the game from segfaulting in the attract mode. However, it is better to resolve the two issues above than to rely on this workaround.

    REAL FIXES:
    As per Hans de Goede kc_i should be initialized to 768 instead of 0.
    See the full text here. The relevant portion of the patch is:

    Index: src/sound/ym2151.c
    ===================================================================
    RCS file: /home/cvs/mess/src/sound/ym2151.c,v
    retrieving revision 1.9
    diff -u -r1.9 ym2151.c
    --- src/sound/ym2151.c  28 Jan 2004 03:15:07 -0000      1.9
    +++ src/sound/ym2151.c  31 Oct 2004 08:21:55 -0000
    @@ -1576,6 +1576,7 @@
             {
                     memset(&chip->oper[i],'\0',sizeof(YM2151Operator));
                     chip->oper[i].volume = MAX_ATT_INDEX;
    +                chip->oper[i].kc_i = 768; /* min kc_i value */
             }
     
             chip->eg_timer = 0;
    

    Cleaning up pointer/int casting

    xmame-specific


    src/unix/video-drivers/glxtool.c:414: warning: cast from pointer to integer of different size
    
  • Simple printf fix. This patch fixes it.

    MAME core

    src/mame.c: In function `mame_validitychecks':
    src/mame.c:1811: warning: cast from pointer to integer of different size
    
    src/common.c: In function `region_post_process':
    src/common.c:1308: warning: cast from pointer to integer of different size
    src/common.c: In function `read_rom_data':
    src/common.c:1459: warning: cast from pointer to integer of different size
    src/common.c: In function `fill_rom_data':
    src/common.c:1544: warning: cast from pointer to integer of different size
    src/common.c: In function `copy_rom_data':
    src/common.c:1558: warning: cast from pointer to integer of different size
    src/common.c: In function `rom_load':
    src/common.c:1845: warning: cast from pointer to integer of different size
    src/common.c:1870: warning: cast from pointer to integer of different size
    
    src/memory.c: In function `assign_dynamic_bank':
    src/memory.c:1310: warning: cast to pointer from integer of different size
    src/memory.c: In function `amentry_needs_backing_store':
    src/memory.c:1662: warning: cast from pointer to integer of different size
    src/memory.c:1672: warning: cast from pointer to integer of different size
    
    src/info.c: In function `print_game_rom':
    src/info.c:371: warning: cast from pointer to integer of different size
    src/info.c:407: warning: cast from pointer to integer of different size
    
    src/state.c: In function `ss_register_func':
    src/state.c:324: warning: cast from pointer to integer of different size
    
    src/cheat.c: In function `RebuildStringTables':
    src/cheat.c:2044: warning: cast from pointer to integer of different size
    src/cheat.c:2045: warning: cast from pointer to integer of different size
    src/cheat.c:2046: warning: cast from pointer to integer of different size
    src/cheat.c:2047: warning: cast from pointer to integer of different size
    src/cheat.c:2048: warning: cast from pointer to integer of different size
    src/cheat.c:2049: warning: cast from pointer to integer of different size
    src/cheat.c: In function `DefaultEnableRegion':
    src/cheat.c:7884: warning: cast from pointer to integer of different size
    src/cheat.c: In function `SetSearchRegionDefaultName':
    src/cheat.c:7982: warning: cast from pointer to integer of different size
    src/cheat.c: In function `BuildCPUInfoList':
    src/cheat.c:10320: warning: cast from pointer to integer of different size
    
    src/vidhrdw/vooddraw.h: In function `add_rasterizer':
    src/vidhrdw/vooddraw.h:207: warning: cast from pointer to integer of different size
    
    src/vidhrdw/wecleman.c: In function `do_blit_zoom16':
    src/vidhrdw/wecleman.c:321: warning: cast from pointer to integer of different size
    src/vidhrdw/wecleman.c:328: warning: cast to pointer from integer of different size
    src/vidhrdw/wecleman.c:352: warning: cast from pointer to integer of different size
    src/vidhrdw/wecleman.c:359: warning: cast to pointer from integer of different size
    src/vidhrdw/wecleman.c:389: warning: cast from pointer to integer of different size
    src/vidhrdw/wecleman.c:396: warning: cast to pointer from integer of different size