I just installed a new theme on my website and was poking around making tweaks this afternoon. I wanted slightly different colors, wanted to make the picture look cooler, and maybe wanted to edit the footer so that the "Made by" credit names me while still crediting the theme mine is based on. However, upon opening footer.php, I found a very weird comment:

 
/* V8 - WARNING: This file is protected by copyright law.
To reverse engineer or decode this file is strictly prohibited. */
 

Well that's weird, because in the style.css we read:

/*The CSS, XHTML and design is released under GPL*/

(Side note, if you don't know what we mean by GPL, check out their site.)

No, they don't say PHP in there; however, because 'design' is included, I read that as "This theme is GPL'd". Poking around their website, I see no mention that you're required to keep any part of the theme the same.

If we read past the warning about reverse engineering, we see why they included it: a nasty big base64-encoded blob, followed by an eval command. Pastebin paste is here.

This piqued my interest, as I can think of very few legitimate reasons for such obfuscation, or for there to be so much of it (footer.php is 47 KB!). My initial thought was that I'd opened a backdoor into my site. Lesser worries were that they could push random stuff into my footer (the last way I was infected), or that they were just trying to control the links at the bottom of the page, so that even if I edited their theme (as is my right under the GPL) I couldn't take credit for it myself and they'd always get the credit. None of those sat right with me, so I hit up the local IRC channel, and we started puzzling.

First I base64_decode'd the garbage that was being run by the eval command (and yes, it's going to overflow the line by a LOT):

 
$lll=0;eval(base64_decode("JGxsbGxsbGxsbGxsPSdiYXNlNjRfZGVjb2RlJzs="));$ll=0;eval($lllllllllll("JGxsbGxsbGxsbGw9J29yZCc7"));$llll=0;$lllll=3;eval($lllllllllll("JGw9JGxsbGxsbGxsbGxsKCRvKTs="));$lllllll=0;$llllll=($llllllllll($l[1])<<8)+$llllllllll($l[2]);eval($lllllllllll("JGxsbGxsbGxsbGxsbGw9J3N0cmxlbic7"));$lllllllll=16;$llllllll="";for(;$lllll<$lllllllllllll($l);){if($lllllllll==0){$llllll=($llllllllll($l[$lllll++])<<8);$llllll+=$llllllllll($l[$lllll++]);$lllllllll=16;}if($llllll&0x8000){$lll=($llllllllll($l[$lllll++])<<4);$lll+=($llllllllll($l[$lllll])>>4);if($lll){$ll=($llllllllll($l[$lllll++])&0x0f)+3;for($llll=0;$llll<$ll;$llll++)$llllllll[$lllllll+$llll]=$llllllll[$lllllll-$lll+$llll];$lllllll+=$ll;}else{$ll=($llllllllll($l[$lllll++])<<8);$ll+=$llllllllll($l[$lllll++])+16;for($llll=0;$llll<$ll;$llllllll[$lllllll+$llll++]=$llllllllll($l[$lllll]));$lllll++;$lllllll+=$ll;}}else$llllllll[$lllllll++]=$llllllllll($l[$lllll++]);$llllll<<=1;$lllllllll--;}eval($lllllllllll("JGxsbGxsbGxsbGxsbD0nY2hyJzs="));$lllll=0;eval($lllllllllll("JGxsbGxsbGxsbD0iPyIuJGxsbGxsbGxsbGxsbCg2Mik7"));$llllllllll="";for(;$lllll<$lllllll;){$llllllllll.=$llllllllllll($llllllll[$lllll++]^0x07);}eval($lllllllllll("JGxsbGxsbGxsbC49JGxsbGxsbGxsbGwuJGxsbGxsbGxsbGxsbCg2MCkuIj8iOw=="));eval($lllllllll);
 

If you want to be able to READ that:

 
<?php
    $lll=0;
    eval(base64_decode("JGxsbGxsbGxsbGxsPSdiYXNlNjRfZGVjb2RlJzs="));
    $ll=0;
    eval($lllllllllll("JGxsbGxsbGxsbGw9J29yZCc7"));
    $llll=0;
    $lllll=3;
    eval($lllllllllll("JGw9JGxsbGxsbGxsbGxsKCRvKTs="));
    $lllllll=0;
    $llllll=($llllllllll($l[1])<<8)+$llllllllll($l[2]);
    eval($lllllllllll("JGxsbGxsbGxsbGxsbGw9J3N0cmxlbic7"));
    $lllllllll=16;
    $llllllll="";
    for(;$lllll<$lllllllllllll($l);){
        if($lllllllll==0){
            $llllll=($llllllllll($l[$lllll++])<<8);
            $llllll+=$llllllllll($l[$lllll++]);
            $lllllllll=16;
        }if($llllll&0x8000){
            $lll=($llllllllll($l[$lllll++])<<4);
            $lll+=($llllllllll($l[$lllll])>>4);
            if($lll){
                $ll=($llllllllll($l[$lllll++])&0x0f)+3;
                for($llll=0;$llll<$ll;$llll++)
                    $llllllll[$lllllll+$llll]=$llllllll[$lllllll-$lll+$llll];
                $lllllll+=$ll;
            }else{
                $ll=($llllllllll($l[$lllll++])<<8);
                $ll+=$llllllllll($l[$lllll++])+16;
                for($llll=0;
                    $llll<$ll;
                    $llllllll[$lllllll+$llll++]=$llllllllll($l[$lllll]));
                $lllll++;
                $lllllll+=$ll;
            }
        } else
            $llllllll[$lllllll++]=$llllllllll($l[$lllll++]);
        $llllll<<=1;
        $lllllllll--;
    }
    eval($lllllllllll("JGxsbGxsbGxsbGxsbD0nY2hyJzs="));
    $lllll=0;
    eval($lllllllllll("JGxsbGxsbGxsbD0iPyIuJGxsbGxsbGxsbGxsbCg2Mik7"));
    $llllllllll="";
    for(;$lllll<$lllllll;){
        $llllllllll.=$llllllllllll($llllllll[$lllll++]^0x07);
    }
    eval($lllllllllll(
        "JGxsbGxsbGxsbC49JGxsbGxsbGxsbGwuJGxsbGxsbGxsbGxsbCg2MCkuIj8iOw=="));
    eval($lllllllll);
?>
 

Wonderful, more base64 junk and obfuscation. When was the last time you read a legitimate program that looked like the above (and wasn't Perl)? It's been a while for me as well. So we went to work, first by translating all the base64 junk into English. Please note that the very first eval binds $lllllllllll to base64_decode, so I've gone ahead and done that part already. I removed the eval() calls and replaced each one with its base64 translation; those are the lines ending in //here.

 
<?php
 
    $lll=0;
    $lllllllllll='base64_decode'; //here
    $ll=0;
    $llllllllll='ord';  //here
    $llll=0;
    $lllll=3;
    $l=$lllllllllll($o); //here
    $lllllll=0;
    $llllll=($llllllllll($l[1])<<8)+$llllllllll($l[2]);
    $lllllllllllll='strlen'; //here
    $lllllllll=16;
    $llllllll="";
    for(;$lllll<$lllllllllllll($l);){
        if($lllllllll==0){
            $llllll=($llllllllll($l[$lllll++])<<8);
            $llllll+=$llllllllll($l[$lllll++]);
            $lllllllll=16;
        }if($llllll&0x8000){
            $lll=($llllllllll($l[$lllll++])<<4);
            $lll+=($llllllllll($l[$lllll])>>4);
            if($lll){
                $ll=($llllllllll($l[$lllll++])&0x0f)+3;
                for($llll=0;$llll<$ll;$llll++)
                    $llllllll[$lllllll+$llll]=$llllllll[$lllllll-$lll+$llll];
                $lllllll+=$ll;
            }else{
                $ll=($llllllllll($l[$lllll++])<<8);
                $ll+=$llllllllll($l[$lllll++])+16;
                for($llll=0;
                    $llll<$ll;
                    $llllllll[$lllllll+$llll++]=$llllllllll($l[$lllll]));
                $lllll++;
                $lllllll+=$ll;
            }
        } else
            $llllllll[$lllllll++]=$llllllllll($l[$lllll++]);
        $llllll<<=1;
        $lllllllll--;
    }
    $llllllllllll='chr'; //here
    $lllll=0;
    $lllllllll="?".$llllllllllll(62); //here
    $llllllllll="";
    for(;$lllll<$lllllll;){
        $llllllllll.=$llllllllllll($llllllll[$lllll++]^0x07);
    }
    $lllllllll.=$llllllllll.$llllllllllll(60)."?"; //here
    eval($lllllllll);
?>
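
The //here translations can be produced without running any of the theme's code. What follows is a minimal sketch, assuming the obfuscated source is available as a plain string: it pulls each double-quoted base64 literal out of a call and decodes it, so nothing is ever handed to eval(). The helper name decode_literals and the regular expression are made up for this illustration and are not part of the theme.

```php
<?php
// Hypothetical helper: decode base64 string literals passed to a call,
// e.g. eval(base64_decode("...")), without executing any of the code.
function decode_literals(string $source): array
{
    $decoded = [];
    // Match ("....") where the quoted text only uses the base64 alphabet.
    if (preg_match_all('/\("([A-Za-z0-9+\/]+={0,2})"\)/', $source, $matches)) {
        foreach ($matches[1] as $literal) {
            // Strict mode returns false on invalid input instead of guessing.
            $plain = base64_decode($literal, true);
            if ($plain !== false) {
                $decoded[] = $plain;
            }
        }
    }
    return $decoded;
}

// The first two eval() calls from the listing above, kept as inert text.
$sample = 'eval(base64_decode("JGxsbGxsbGxsbGxsPSdiYXNlNjRfZGVjb2RlJzs="));'
        . 'eval($lllllllllll("JGxsbGxsbGxsbGw9J29yZCc7"));';

foreach (decode_literals($sample) as $line) {
    echo $line, "\n";
}
```

Because the sample is held in a single-quoted string, PHP never interpolates or runs it; the script only prints the two assignments that the evals would have performed.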
 

Next we perform the variable substitutions that the first few eval() calls set up. Please note that towards the bottom there are a few variables we can't simply replace just yet.

 
<?php
 
    $lll=0;
    $ll=0;
    $llll=0;
    $lllll=3;
    $l=base64_decode($o); //Leaving this since I only want it to evaluate once
    $lllllll=0;
    $llllll=(ord($l[1])<<8)+ord($l[2]);
    $lllllllll=16;
    $llllllll="";
    for(;$lllll<strlen($l);){
        if($lllllllll==0){
            $llllll=(ord($l[$lllll++])<<8);
            $llllll+=ord($l[$lllll++]);
            $lllllllll=16;
        }if($llllll&0x8000){
            $lll=(ord($l[$lllll++])<<4);
            $lll+=(ord($l[$lllll])>>4);
            if($lll){
                $ll=(ord($l[$lllll++])&0x0f)+3;
                for($llll=0;$llll<$ll;$llll++)
                    $llllllll[$lllllll+$llll]=$llllllll[$lllllll-$lll+$llll];
                $lllllll+=$ll;
            }else{
                $ll=(ord($l[$lllll++])<<8);
                $ll+=ord($l[$lllll++])+16;
                for($llll=0;
                    $llll<$ll;
                    $llllllll[$lllllll+$llll++]=ord($l[$lllll]));
                $lllll++;
                $lllllll+=$ll;
            }
        } else
            $llllllll[$lllllll++]=ord($l[$lllll++]);
        $llllll<<=1;
        $lllllllll--;
    }
    $lllll=0;
    $lllllllll="?".chr(62); //Leaving as we can't evaluate yet.
    $llllllllll="";
    for(;$lllll<$lllllll;){
        $llllllllll.=chr($llllllll[$lllll++]^0x07);
    }
    $lllllllll.=$llllllllll.chr(60)."?"; //Leaving as we can't evaluate yet.
    eval($lllllllll);
?>
 

Now, let's make it readable by renaming all those $l variables to single English letters. The first one will be $a, the next $b, and so on. I missed one and had to go back, but here it is.

  1.
  2. <?php
  3.
  4.     $a=0;
  5.     $b=0;
  6.     $c=0;
  7.     $d=3;
  8.     $l=base64_decode($o); //leaving this as $l
  9.     $e=0;
  10.     $g=(ord($l[1])<<8)+ord($l[2]); //whoops, skipped one...
  11.     $f=16;
  12.     $h="";
  13.     for(;$d<strlen($l);){
  14.         if($f==0){
  15.             $g=(ord($l[$d++])<<8);
  16.             $g+=ord($l[$d++]);
  17.             $f=16;
  18.         }if($g&0x8000){
  19.             $a=(ord($l[$d++])<<4);
  20.             $a+=(ord($l[$d])>>4);
  21.             if($a){
  22.                 $b=(ord($l[$d++])&0x0f)+3;
  23.                 for($c=0;$c<$b;$c++)
  24.                     $h[$e+$c]=$h[$e-$a+$c];
  25.                 $e+=$b;
  26.             }else{
  27.                 $b=(ord($l[$d++])<<8);
  28.                 $b+=ord($l[$d++])+16;
  29.                 for($c=0;
  30.                     $c<$b;
  31.                     $h[$e+$c++]=ord($l[$d]));
  32.                 $d++;
  33.                 $e+=$b;
  34.             }
  35.         } else
  36.             $h[$e++]=ord($l[$d++]);
  37.         $g<<=1;
  38.         $f--;
  39.     }
  40.     $d=0;
  41.     $f="?".chr(62); //here
  42.     $i="";
  43.     for(;$d<$e;){
  44.         $i.=chr($h[$d++]^0x07);
  45.     }
  46.     $f.=$i.chr(60)."?"; //here
  47.     eval($f);
  48. ?>
  49.
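
For anyone who wants to go one step further than single letters, here is a sketch of the same loop with descriptive names. The names (flags, offset, count and so on) are guesses at intent, not anything the theme authors wrote. Read this way, the loop looks like a small decompressor: a 16-bit flag word says whether each following item is a literal byte, a copy of earlier output, or a run of one repeated byte. The sketch also starts the output as an array rather than an empty string, on the assumption that newer PHP versions no longer turn an empty string into an array on indexed assignment.

```php
<?php
// Restatement of the unpacking loop with guessed, descriptive names.
// Returns the unpacked byte values as an array of integers.
function unpack_bytes(string $packed): array
{
    $out = [];
    $outLen = 0;
    $pos = 3;                                        // byte 0 is never read
    $flags = (ord($packed[1]) << 8) + ord($packed[2]);
    $bitsLeft = 16;
    $len = strlen($packed);

    while ($pos < $len) {
        if ($bitsLeft == 0) {                        // reload the flag word
            $flags = ord($packed[$pos++]) << 8;
            $flags += ord($packed[$pos++]);
            $bitsLeft = 16;
        }
        if ($flags & 0x8000) {
            $offset = ord($packed[$pos++]) << 4;
            $offset += ord($packed[$pos]) >> 4;
            if ($offset) {
                // Copy $count values from $offset positions back.
                $count = (ord($packed[$pos++]) & 0x0f) + 3;
                for ($i = 0; $i < $count; $i++) {
                    $out[$outLen + $i] = $out[$outLen - $offset + $i];
                }
                $outLen += $count;
            } else {
                // Offset zero: a run of one value repeated $count times.
                $count = ord($packed[$pos++]) << 8;
                $count += ord($packed[$pos++]) + 16;
                $value = ord($packed[$pos++]);
                for ($i = 0; $i < $count; $i++) {
                    $out[$outLen + $i] = $value;
                }
                $outLen += $count;
            }
        } else {
            $out[$outLen++] = ord($packed[$pos++]);  // plain literal byte
        }
        $flags <<= 1;
        $bitsLeft--;
    }
    return $out;
}

// Made-up input: two literals ("ab"), then a copy of 4 values from 2 back.
$packed = "\x00\x20\x00" . "ab" . "\x00\x21";
echo implode(',', unpack_bytes($packed)), "\n";
```

The tiny input at the end is invented purely to exercise the literal and copy branches; it is not taken from the theme.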

At this point it's almost readable, although astute observers will have noted one variable that's weird. $o is wrapped in a base64_decode call, and the result is bound to $l. $o is, shockingly enough, 44K of... packed binary. I didn't notice this until someone else pointed it out; I had just glossed over all that junk and started on the obvious eval command. To get a good view of it, try here.

Line 10 of the above can be broken into smaller steps, so let's do that (just to get an idea of the exact numbers). We can also run the strlen() call and learn that the length of $l is 33222! Geez... So let's substitute that into the for loop on line 13. After the massive loop, $e ends up as 83760, so we can substitute that as well. Finally, I replaced the chr() calls towards the end (chr(62) is '>' and chr(60) is '<') and put the characters directly into the strings they're concatenated to.

 
    $ga = ord($l[1])<<8; //48
    $gb = ord($l[2]); //0
    $g = $ga + $gb; //48
    ....
    for(;$d<33222;){
    ....
    $f="?>";
    ....
    for(;$d<83760;){
    ....
    $f.=$i."<?";
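
The $ga/$gb split is just the usual way of gluing two bytes into one 16-bit number, high byte first. A small sketch of that idea on a made-up three-byte string (the helper name flag_word is hypothetical), cross-checked against PHP's unpack() with the 'n' format, which reads an unsigned 16-bit big-endian value:

```php
<?php
// Combine two bytes of $data, starting at $offset, into one 16-bit number
// (high byte first), the same way the listing builds $g.
function flag_word(string $data, int $offset): int
{
    return (ord($data[$offset]) << 8) + ord($data[$offset + 1]);
}

// Made-up three-byte input; byte 0 is skipped just like $l[0] is.
$data = "\x00\x12\x34";
printf("0x%04x\n", flag_word($data, 1));

// unpack('n', ...) reads an unsigned 16-bit big-endian value and agrees.
var_dump(flag_word($data, 1) === unpack('n', substr($data, 1, 2))[1]);
```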
 

So, where does this leave us? We have a nice little (readable) script that does (something) to a HUGE bit of packed binary. Now, if you look at the for loop there, it's not going to be fun to step through. Our index goes up to 33222 and is incremented by at most 6 every iteration; at worst you'll be stepping by 1. Well, the eval() command isn't until the end, so let's just see what that nasty big loop spits out, hmm? As long as you die() before the eval command, you'll be alright. What this spits out is a nasty huge array...

 
Array
(
    [0] => 59
    [1] => 56
    [2] => 119
    [3] => 111
    [4] => 119
    [5] => 13
    //....<snip 83000ish lines>
    [83754] => 97
    [83755] => 60
    [83756] => 13
    [83757] => 13
    [83758] => 56
    [83759] => 57
)
 

Now that's just lovely. Hmm, but looking at the values gives me an idea. I decided to spit out an array of the distinct values on the right-hand side, along with how many times each one occurs.

 
Array
(
    [59] => 93
    [56] => 1
    [119] => 1110
    [111] => 2748
    [13] => 630
    [36] => 70
    [35] => 349
    [99] => 1773
    [98] => 8490
    [101] => 1112
    [114] => 3197
    [96] => 1525
    [39] => 7322
    [58] => 227
    [54] => 153
    [60] => 577
    [107] => 3064
    [110] => 3057
    [105] => 4299
    [108] => 1170
    [102] => 4179
    [125] => 766
    [51] => 94
    [104] => 2476
    [116] => 3241
    [115] => 4404
    [88] => 414
    [84] => 541
    [66] => 354
    [85] => 163
    [81] => 130
    [92] => 19
    [32] => 1487
    [79] => 425
    [83] => 386
    [87] => 187
    [72] => 101
    [90] => 19
    [97] => 635
    [47] => 127
    [46] => 127
    [100] => 2031
    [37] => 305
    [61] => 391
    [41] => 886
    [117] => 4419
    [57] => 100
    [86] => 2
    [82] => 79
    [78] => 59
    [106] => 1421
    [69] => 438
    [112] => 1295
    [42] => 688
    [126] => 198
    [40] => 1630
    [53] => 110
    [52] => 82
    [55] => 105
    [63] => 30
    [67] => 209
    [74] => 153
    [50] => 74
    [48] => 37
    [62] => 27
    [113] => 313
    [89] => 3
    [43] => 3496
    [70] => 297
    [73] => 169
    [68] => 172
    [80] => 271
    [14] => 1161
    [65] => 243
    [76] => 337
    [127] => 186
    [95] => 10
    [93] => 47
    [64] => 213
    [33] => 442
    [75] => 103
    [49] => 34
    [118] => 5
    [77] => 36
    [109] => 41
    [124] => 17
    [122] => 17
    [44] => 10
    [34] => 21
    [45] => 9
    [123] => 5
)
 

The minimum is 13 and the maximum is 127, so these do fit nicely on an ASCII table. Now, 13 is a LOT smaller than I expected (down at carriage return), but such is life. Sorted, it looks a bit better, but the most frequent values don't map to the ASCII characters I'd hoped to see the most of.

 
Array
(
    [13] => 630
    [14] => 1161
    [32] => 1487
    [33] => 442
    [34] => 21
    [35] => 349
    [36] => 70
    [37] => 305
    [39] => 7322
    [40] => 1630
    [41] => 886
    [42] => 688
    [43] => 3496
    [44] => 10
    [45] => 9
    [46] => 127
    [47] => 127
    [48] => 37
    [49] => 34
    [50] => 74
    [51] => 94
    [52] => 82
    [53] => 110
    [54] => 153
    [55] => 105
    [56] => 1
    [57] => 100
    [58] => 227
    [59] => 93
    [60] => 577
    [61] => 391
    [62] => 27
    [63] => 30
    [64] => 213
    [65] => 243
    [66] => 354
    [67] => 209
    [68] => 172
    [69] => 438
    [70] => 297
    [72] => 101
    [73] => 169
    [74] => 153
    [75] => 103
    [76] => 337
    [77] => 36
    [78] => 59
    [79] => 425
    [80] => 271
    [81] => 130
    [82] => 79
    [83] => 386
    [84] => 541
    [85] => 163
    [86] => 2
    [87] => 187
    [88] => 414
    [89] => 3
    [90] => 19
    [92] => 19
    [93] => 47
    [95] => 10
    [96] => 1525
    [97] => 635
    [98] => 8490
    [99] => 1773
    [100] => 2031
    [101] => 1112
    [102] => 4179
    [104] => 2476
    [105] => 4299
    [106] => 1421
    [107] => 3064
    [108] => 1170
    [109] => 41
    [110] => 3057
    [111] => 2748
    [112] => 1295
    [113] => 313
    [114] => 3197
    [115] => 4404
    [116] => 3241
    [117] => 4419
    [118] => 5
    [119] => 1110
    [122] => 17
    [123] => 5
    [124] => 17
    [125] => 766
    [126] => 198
    [127] => 186
)
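
Both views of the tally are one-liners in PHP. A minimal sketch, using only the first six values of the big array as stand-in data: array_count_values() gives the counts in first-seen order (which matches the unsorted listing), and ksort() reorders them by byte value.

```php
<?php
// Count how often each byte value occurs, then report the range.
// $bytes stands in for the big array; these six values are its first entries.
$bytes = [59, 56, 119, 111, 119, 13];

$tally = array_count_values($bytes); // keys appear in first-seen order
$sorted = $tally;
ksort($sorted);                      // same counts, ordered by byte value

print_r($tally);
print_r($sorted);
echo 'min: ', min(array_keys($tally)), ', max: ', max(array_keys($tally)), "\n";
```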
 

Thankfully we're not done yet. Notice that each value is XORed with 0x07 (the ^0x07 in the final loop) before being turned into a character, so maybe the list will look better if we apply that operation to it. To make it tidy, and to continue the assumption that this is ASCII, I've substituted the character for the integer.

 
Array
(
    [<tab>] => 1161
    [<line feed>] => 630
    [ ] => 7322
    ["] => 305
    [#] => 70
    [$] => 349
    [%] => 21
    [&] => 442
    ['] => 1487
    [(] => 127
    [)] => 127
    [*] => 9
    [+] => 10
    [,] => 3496
    [-] => 688
    [.] => 886
    [/] => 1630
    [0] => 105
    [1] => 153
    [2] => 110
    [3] => 82
    [4] => 94
    [5] => 74
    [6] => 34
    [7] => 37
    [8] => 30
    [9] => 27
    [:] => 391
    [;] => 577
    [<] => 93
    [=] => 227
    [>] => 100
    [?] => 1
    [A] => 297
    [B] => 438
    [C] => 172
    [D] => 209
    [E] => 354
    [F] => 243
    [G] => 213
    [H] => 425
    [I] => 59
    [J] => 36
    [K] => 337
    [L] => 103
    [M] => 153
    [N] => 169
    [O] => 101
    [P] => 187
    [Q] => 2
    [R] => 163
    [S] => 541
    [T] => 386
    [U] => 79
    [V] => 130
    [W] => 271
    [X] => 10
    [Z] => 47
    [[] => 19
    []] => 19
    [^] => 3
    [_] => 414
    [a] => 4179
    [b] => 1112
    [c] => 2031
    [d] => 1773
    [e] => 8490
    [f] => 635
    [g] => 1525
    [h] => 2748
    [i] => 3057
    [j] => 41
    [k] => 1170
    [l] => 3064
    [m] => 1421
    [n] => 4299
    [o] => 2476
    [p] => 1110
    [q] => 5
    [r] => 4419
    [s] => 3241
    [t] => 4404
    [u] => 3197
    [v] => 313
    [w] => 1295
    [x] => 186
    [y] => 198
    [z] => 766
    [{] => 17
    [|] => 5
    [}] => 17
)
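
The re-keying above is a single XOR per entry. A sketch of it on three rows copied from the sorted tally (the function name xor_tally is made up): the keys are kept as integer character codes so that sorting stays unambiguous, and are only turned into characters for display.

```php
<?php
// Re-key a tally of stored byte values by what each value decodes to.
// Keys stay as integer character codes so sorting is unambiguous.
function xor_tally(array $tally, int $mask = 0x07): array
{
    $decoded = [];
    foreach ($tally as $value => $count) {
        $decoded[$value ^ $mask] = $count;
    }
    ksort($decoded);
    return $decoded;
}

// Three rows taken from the sorted tally earlier in the post.
$decoded = xor_tally([13 => 630, 57 => 100, 59 => 93]);
foreach ($decoded as $code => $count) {
    // json_encode makes control characters such as a line feed visible.
    echo json_encode(chr($code)), ' => ', $count, "\n";
}
```

So 13 becomes a line feed (10), 59 becomes '<' (60) and 57 becomes '>' (62), which is where those rows in the table come from.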

This looks good to me. Notice the matched braces (17 each), matched parens (127 each), and almost matched angle brackets (93 and 100). Also note that there are many more lowercase than uppercase letters, and that lowercase 'e' is far ahead of the other letters, implying this has English words (or at least English something) in it.

And again we end up with a nasty large for loop, completely worthless to try to step through:

 
    for(;$d<83760;){
        $i.=chr($h[$d++]^0x07);
    }
 

83760 steps in that loop, building whatever it is we're going to eval at the end, char by char! I'm going to cheat again: run the loop, spit out the result, and die before the eval. Looks like 632 lines of... more PHP! Yay, this time it's not obfuscated!
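
A minimal sketch of that cheat, assuming the unpacked values are already sitting in an array: undo the XOR and return the text rather than passing it to eval(). The function name decode_payload and the output path in the last comment are placeholders. Feeding it the first and last six entries of the array dump shows that the payload opens with a PHP open tag and ends with a close tag, which would explain why the script wraps it in a close tag and an open tag before the eval.

```php
<?php
// Turn the array of stored values into source text by undoing the XOR,
// mirroring the final loop but returning the text instead of eval()ing it.
function decode_payload(array $values, int $mask = 0x07): string
{
    $text = '';
    foreach ($values as $value) {
        $text .= chr($value ^ $mask);
    }
    return $text;
}

// The first six and last six entries of the array dump shown earlier.
$head = [59, 56, 119, 111, 119, 13];
$tail = [97, 60, 13, 13, 56, 57];
var_dump(decode_payload($head));
var_dump(decode_payload($tail));

// With the full array, one might save the text for reading (made-up path):
// file_put_contents('/tmp/footer_decoded.txt', decode_payload($h));
```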

The decoded script starts off with 410 lines holding 4 arrays of URLs, most of them appearing to be German. After that it does some work on your domain name and the requested page, then combines all of the above to pull out specific URLs to seed your footer with. They were nice enough to include a $debug flag, which I turned on before running the script on a different site that I wasn't too worried about. Here's the output ("Rest nach Teilung" is German for "remainder after division"):

 
host: thievestavern.com
uri: /final_code.php
numbers: 0
 
final_code.php
url: thievestavern.com/final_code.php
url_base64: dGhpZXZlc3RhdmVybi5jb20vZmluYWxfY29kZS5waHA=
url_zahl: 20295
 
array_a_count: 22
array_b_count: 90
array_c_count: 87
array_d_count: 145
array_e_count: 2
url_count: 32
 
crc32 url: -1485503542
intval(crc32 count): 1485503542
 
count_a (url_count + numbers): 03542
 
count_a % array_a_count: 03542 % 22
count_a (Rest nach Teilung): 0
 
count_b (url_count + numbers): 03542
 
count_b % array_b_count: 03542 % 90
count_b (Rest nach Teilung): 32
 
count_c (url_count + numbers): 03542
 
count_c % array_c_count: 03542 % 87
count_c (Rest nach Teilung): 62
 
count_d (url_count + numbers): 03542
 
count_d % array_d_count: 03542 % 145
count_d (Rest nach Teilung): 62
 
count_e (url_count + numbers): 03542
 
count_e % array_e_count: 03542 % 2
count_e (Rest nach Teilung): 0
 
url_zahl % anchor_1_count: 20295 % 6
 
url_zahl_rest_1: 3
anchor_1_count: 6
 
url_zahl_rest_2: 2
anchor_2_count: 7
 
url_zahl_rest_3: 5
anchor_3_count: 10
 
url_zahl_rest_4: 0
anchor_4_count: 11
 
url_zahl_rest_5: 2
anchor_5_count: 7
 
count: 03542
//Links were spit out here... I'm not including them as
//I don't want to advertise for random people
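
Judging by the log alone, the link selection boils down to remainders: a number derived from the URL is taken modulo the size of each array. The sketch below only reproduces the arithmetic from the logged numbers. That "count" is the last five digits of the absolute crc32 value is an assumption read off the log, and how url_zahl is computed is not shown in the log at all, so it is simply copied.

```php
<?php
// Reproduce the remainders from the debug log. Array sizes and the two
// seed numbers are copied from the log; how the script derives them from
// the URL is not shown here.
function remainders(int $seed, array $sizes): array
{
    $out = [];
    foreach ($sizes as $name => $size) {
        $out[$name] = $seed % $size;
    }
    return $out;
}

// Assumption: "count" is the last five digits of abs(crc32), as the log hints.
$count = (int) substr((string) abs(-1485503542), -5); // "03542" becomes 3542

$arrays  = ['a' => 22, 'b' => 90, 'c' => 87, 'd' => 145, 'e' => 2];
$anchors = [1 => 6, 2 => 7, 3 => 10, 4 => 11, 5 => 7];

print_r(remainders($count, $arrays));   // compare with the count_a..count_e lines
print_r(remainders(20295, $anchors));   // compare with the url_zahl_rest_N lines
```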
 

Conclusions
I'm considering this question solved. The obfuscated piece of junk that was residing in my footer turns out to be nothing more than a very paranoid way to put links into the footer. It's handy for them, as it makes it hard to automatically figure out which sites have these spam footers (without doing what I did here), and it also intimidates the normal user into not removing the links. This is fairly clever and involved, and a nice feature is that they can simply change $o to get a whole new set of links, or to change the footer's design. I'll give them props for that.

All of the above makes me say that themespack.com isn't really interested in putting good free (as in freedom, and beer) themes into the hands of bloggers; they're more interested in themselves. Can I fault them? No. However, I do take issue with trying to hide behind 'copyright' to prevent me from changing the links on my own blog.

If you get one of these themes, I'd highly recommend you do the following: go into the footer and remove the 'alignleft' div. Put in a 'theme created by ' link, and then give yourself credit for editing it, as is your right under the GPL. I'd love to say to run the script and give credit to the supposed creators, but those end up being random German companies, clearly not people in the business of writing PHP.
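
If you aren't sure whether a theme you downloaded has one of these footers, a crude check can at least point you at files worth opening. This is a rough heuristic sketched for illustration, not a real scanner: it flags PHP source that both calls eval() and contains a long unbroken run of base64-looking characters. The function name, the 200-character threshold, and the directory path are all made up, and a determined author could dodge it easily.

```php
<?php
// Rough heuristic, not a real malware scanner: flags PHP source that both
// calls eval() and carries a long unbroken run of base64-looking text.
function looks_obfuscated(string $php, int $minRun = 200): bool
{
    $hasEval = preg_match('/\beval\s*\(/i', $php) === 1;
    $hasBlob = preg_match('/[A-Za-z0-9+\/]{' . $minRun . ',}/', $php) === 1;
    return $hasEval && $hasBlob;
}

// Hypothetical usage: check every PHP file in a theme directory.
// The directory path is a placeholder.
foreach (glob('/path/to/theme/*.php') ?: [] as $file) {
    if (looks_obfuscated((string) file_get_contents($file))) {
        echo "suspicious: $file\n";
    }
}
```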

Oh, and for the record, all of the above is what the server does when it runs the PHP code, so I'm not reverse engineering it any more than a few calls to base64_decode would.
