Entropy analysis

I wanted a tool that does entropy analysis using Shannon entropy, so I wrote one. Shannon entropy is a measure of uncertainty. In math it is denoted by H, the capital Greek letter eta, and it is defined as the expected value E[I(X)], where I(X), the information content, is -log2(P(X)). The logarithm is base 2 here, so the result is measured in bits. When working on a finite sample {x1, x2, x3, ..., xn} we can calculate H(X) by

H(X) = -sum over i = 1..n of P(xi) * log2(P(xi))


This helps me find random-looking blocks or chunks in binaries or other data: packed data, encrypted blocks, and other interesting things. Computed over bytes, the entropy goes from 0 to 8, where 0 (small entropy) is very not random ^_^ and 8 (large entropy) is very random. I wrote the code in Go[1].

For example, I have two binaries: one that is unpacked and one that is packed. Let us see what their entropy graphs look like.

This will generate two graphs of 32 blocks each, with a "suspicious" line drawn at entropy 6.5, so anything in the graph that goes above that line tells us where to look in the binary. It will also list the suspicious blocks, those with entropy 6.5 and above, on stdout. bin.unpacked has an entropy of 5.909006 while bin.packed has an entropy of 7.87337, which tells us which one is packed and which one isn't. Looking at the graphs makes it clearer.

bin.unpacked graph


bin.packed graph


Looking at the graph of bin.unpacked, we can see that it's a normal ELF file: it starts with very small entropy, then goes up and stays at around 4, with no spikes or anything irregular. bin.packed, on the other hand, has high entropy in general, except at the very beginning where it contains the "unpacking code"; the rest is just packed code with a specific pattern. The binary was packed with UPX. This information can be useful when working on firmware or unknown data formats, or more generally to spot random or packed code inside a binary, or to find cryptographic keys, which have large entropy, etc.

