Binary diff between libc from ScientificLinux and CentOS

In our IRC channel the question arose how we can see the difference between binaries build on different systems (SL and CentOS in this case).

Our initial test binary was created using this snippet:

yum install vala glib2-devel
echo 'print("Hello");' > a.vala
valac a.vala && objdump -D a

We came then up with this small script to automate the diff creation:

dump() {
TMPFILE=$(mktemp)
objdump -D $1 > $TMPFILE
cat $TMPFILE | sed "s/^[[:space:]]\+[0-9a-f]\+//"
rm $TMPFILE
}
 
cmp_dump() {
dump $1 > dump_a
dump $2 > dump_b
diff -u dump_a dump_b
rm dump_a dump_b
}
 
# Usage: cmp_dump first-binary second-binary
$@

So to compare two binaries and return the diff between their obj dump output (without addresses) you run:

bash cmp_dump.sh libc-sl libc-centos

This lead us to this result for libc from CentOS and ScientificLinux:

    --- dump_a      2013-09-11 12:39:30.799452944 +0200
    +++ dump_b      2013-09-11 12:39:32.061467634 +0200
    @@ -1,5 +1,5 @@
     
    -libc-2.12.so-centos6:     file format elf64-x86-64
    +libc-2.12.so-sl6:     file format elf64-x86-64
     
     
     Disassembly of section .note.gnu.build-id:
    @@ -13,13 +13,18 @@
     :      00 00                   add    %al,(%rax)
     :      47                      rex.RXB
     :      4e 55                   rex.WRX push %rbp
    -:      00 34 95 f7 bc a6 d9    add    %dh,-0x26594309(,%rdx,4)
    -:      9e                      sahf  
    -:      30 1d 7c 9f 95 9a       xor    %bl,-0x656a6084(%rip)        # ffffffff9a95a209 <_end+0xffffffff9a5c7961>
    -:      87 0f                   xchg   %ecx,(%rdi)
    -:      8e 77 c0                mov    -0x40(%rdi),%?
    -:      f0                      lock
    -:      b7                      .byte 0xb7
    +:      00 9a df 3d a2 9c       add    %bl,-0x635dc221(%rdx)
    +:      0c 20                   or     $0x20,%al
    +:      70 7f                   jo     308 <data.10540+0x2a8>
    +:      ed                      in     (%dx),%eax
    +:      6d                      insl   (%dx),%es:(%rdi)
    +:      49 9a                   rex.WB (bad)
    +:      2b 20                   sub    (%rax),%esp
    +:      0e                      (bad)  
    +:      67                      addr32
    +:      6b                      .byte 0x6b
    +:      68                      .byte 0x68
    +:      b1                      .byte 0xb1
     
     Disassembly of section .note.ABI-tag:
     
    @@ -486756,6 +486761,5 @@
     :      62                      (bad)  
     :      75 67                   jne    79 <data.10540+0x19>
     :      00 00                   add    %al,(%rax)
    -:      c7                      (bad)  
    -:      fc                      cld    
    -:      87 37                   xchg   %esi,(%rdi)
    +:      da 38                   fidivrl (%rax)
    +:      73 e0                   jae    fffffffffffffff8 <_end+0xffffffffffc6d750>

This ain’t a big difference for a 2M+ file, is it?

So is this a valuable way to compare binaries?

At the end there is the question: Can this be used to compare if binaries from an RPM were really created from the sources of it’s SRPM?

::: {#footer} [ September 11th, 2013 12:52pm ]{#timestamp} [fedora]{.tag} [centos]{.tag} [sl]{.tag} [scientificlinux]{.tag} [binary]{.tag} [diff]{.tag} :::