WiFi networks and malware epidemiology
Posted on April 5, 2009 by Jorrit Kronjee
Filed Under malware, router, wireless | Leave a Comment
Reading up on security news, I found this paper about WiFi networks and malware epidemiology. Although I know jack all about epidemics and their infection rates, I do know a thing or two about wireless routers and their security, so I thought I should put my two cents in.
The paper states that malware could theoretically propagate from router to router using ad-hoc wireless networks. This is based on the following premises. Since most routers still use no or limited encryption (e.g. WEP), building an ad-hoc wireless network between two routers should be possible. Furthermore, most users don’t change the default administrator password, making the actual malware infection feasible.
The paper then goes on how a malware infection could spread in a densely populated area like Chicago, where many wireless networks overlap each other. It ends with the following conclusion:
“Based on this work, we note that there is a real concern about the wireless spread of WiFi based malware. This suggests that action needs to be taken to detect and prevent such outbreaks, as well as more thoughtful planning for the security of future wireless devices, so that such scenarios do not occur or worsen with future technology. (…) Lastly, it is highly likely that we will only see the proliferation of more wireless standards as time goes by, and all of these standards should consider the possibility of such epidemics.”
And this is where I disagree. Yes, it’s true that routers can be infected (I can’t disagree even if I would want to, not since the discovery of psyb0t) and yes, that might happen through an ad-hoc wireless connection. The practicality however is an extremely complex ordeal.
First of all, wireless routers can’t make ad-hoc connections. I’m sure their hardware supports it, but the firmware doesn’t and why would it? Your laptop/computer/Nintendo Wii is supposed to make the connection to the router, not the other way around.
So, the malware is going to have to be equipped with some code to make a connection. Maybe in some cases they left a few hooks available in the firmware that can be abused, but a lot of the times you’re going to have to write parts of the driver yourself. And not every wireless router uses the same chipset. Unless you’re targetting a specific regional ISP that has standardized on a certain type of wireless router (and don’t forget, the maximum level of encryption it’s allowed to use is WEP for this to work!), you might be out of luck.
Second, wireless routers don’t all use the same hardware. Sure, more and more routers are MIPS-based and run some form of Linux with busybox, but it’s certainly not standardized. You could fix this by making your malware statically compiled, but you can’t make your malware too big, since the available flash memory on a typical router is limited to a few MB’s.
Lastly, why would you use the wireless network as a point of entry? psyb0t certainly proves that using the Internet to infect routers is certainly feasible and it allows wireless routers that are not in the vicinity of each other to propagate the malware. Using the wireless connection just makes it harder without any benefits.
In conclusion, I believe that the epidemic as described is very unlikely to occur. More and more routers have WPA-encryption set as default. Those that don’t have no standards to rely on, making malware hard to program and therefor less dangerous and less contagious. While I do believe that router infections will become more common in the future, I don’t think wireless networks will be used for propagation, at least, not anywhere in the foreseeable future.
These were my two cents.
phentropy revisited
Posted on February 17, 2009 by Jorrit Kronjee
Filed Under RNG, pattern analysis | 1 Comment
I’ll have to admit something. The title of this post is a little bit misleading. This won’t be exactly like Dan Kaminsky’s phentropy pictures, neither is it as awesome as Michal Zalewski’s research on TCP/IP sequence numbers. However, it is a study of RNGs. RNGs (Random Number Generators) – sometimes called PRNGs (Pseudo RNGs) to specify the non-randomness – are algorithms that create random numbers. These random numbers can be used for a multitude of things like shuffling your playlist or making the Tetris brick that looks like a hook appear, but also cryptography. In that case, random numbers are used to generate random keys.
Okay, so what did I do? Well, I didn’t quite study the RNGs used in popular encryption software nowadays. Instead I’ve decided to just look at general purpose RNGs and see what I could get. Hopefully this study will spur some ideas in the future.
To study them I took 50,000 samples from a few general purpose RNGs. For every RNG, this resulted in a sequence an. I then used the following equations to determine the xyz-coordinates and plotted these points in a 3D space with gnuplot.



The purpose of 3D plotting is to visualize how random a RNG can be without having to see the algorithm behind it. I realize that this is not all and that a lot more needs to be verified before someone can truly say that it’s cryptographically strong enough, but let’s save that for another post.
So first up is the Mersenne Twister. For this test, I’m using the PHP implementation of the twister, namely mt_rand(). This produced the following image.
The different colors specify the amount of clustering in a certain area. I used the following equation to group points together (I know this is essentially cheating a little):

(where m is an integer number)
Yellow means the most dense and red means the least dense. Anything in between is orange. As you can see, the twister produced an even distribution of points within the cube.
Next one is bash’s $RANDOM variable. I don’t really know how $RANDOM gets its value, because I haven’t looked at bash’s code, but my guess is that $RANDOM gets filled by subsequent calls to rand(). Again, I retrieved 50,000 samples and plotted them in a cube. Although the scale is a lot smaller (maximum value is 32,767), it still shows a pretty even distribution of values. Nothing spectacular there.
So, hoping to get more interesting results, I tried Windows XP’s %RANDOM%, which can be used in batch scripts since Windows 2000.
But again, an equal distribution of numbers, so no dice.
I got a little bit tired of looking at the same pictures over and over again, so I started looking for trouble and found a couple of RNGs that were known to be bad. One of them is John von Neumann’s Mid-square method. The method is pretty simple; take a seed, do seed2, take x number of digits from the middle and use that as the result. The new seed will be the current result. 50,000 samples later this picture came to life:
Although the mid-square method does generate seemingly random numbers, it has one big flaw, once the seed reaches zero, the method will output zeros forever. This is why, in the cube above, you see only red dots and one big yellow dot at (0,0,0).
The last one I wanted to try was RANDU. RANDU was originally designed with simplicity in mind, which is why the following equation can be implemented using just a handful of bit manipulation operations:

So, again, I generated 50,000 samples … yadiyadiyada, and then this picture appeared:
At a first glance RANDU seems pretty random. An equal distribution of numbers and aside from there being a lot more yellow dots, I wouldn’t say it was flawed. But then, I turned the image around:
As you can see, instead of being scattered RANDU generated 15 planes in a 3D space. Not surprisingly, RANDU is – to my knowledge – not used as a randomizer anymore as this is simply not random enough. Or as Donald Knuth puts it: “…its very name RANDU is enough to bring dismay into the eyes and stomachs of many computer scientists!” Oh, that Donald…
After having done all this, I feel this was all for naught. Although I have learnt a new way to look at RNGs, I was not really able to uncover anything we didn’t already know. During the tests I did find other new ways to test RNGs and perhaps next time I will apply these techniques to these RNGs, but for now this will be it.
phpbb.com hacked and the usage of passwords
Posted on February 7, 2009 by Jorrit Kronjee
Filed Under in the media, passwords | Leave a Comment
It’s been a while since I last posted, but to be quite honest, I don’t think there was much to post about anyway. Of course there’s always something going on; exploits and new techniques get released everyday, but nothing to get too excited about. My little project has come to a grinding halt, mainly because I didn’t have the time (not because I was busy, no, of course not). Happy new year, by the way! In other news, the dynamic duo Lee & Louis are still not talking and Australia is still going to get a firewall. Oh well…
There’s been a lot of chatter about phpbb.com being hacked. Unlike other hacks, the hacker actually took the time to make a write-up on how he did it, as well as post a part of the user database including decrypted passwords and all the e-mail addresses from their mailing list.
What I thought was interesting about it, is not so much the hack itself – because that was borderline script kiddie stuff – but how people reacted to it. “These guys spend time developing and support a free product in their own time and you repay them by hacking their site and publicly releasing their personal information?” or “Seriously, who the hell is so harmed by FOSS that a FOSS project deserves this? What’s next? Releasing the SSN’s of blood donors or 9/11 rescue workers?” Apparently hacking a FOSS website is more despicable than hacking for example Microsoft. As if that is for some reason more noble.
I like to perceive these things with a sense of amorality; websites are going to be hacked whether I post enraged comments about them or not. The question is: why do they get hacked? In the case of phpbb.com, it was pretty straightforward. The hacker used an PHPList exploit from milw0rm.com and was able to run his own code shortly after. And this is how it usually happens.
What strikes me about this, though (and about any hack for that matter), is the fact that there were no passwords used. No passwords. Within the world of IT security it’s the main thing everyone always fuzzes about; users leaving their passwords on postit notes, users choosing silly passwords like “123456″ or “password”, users giving up their password for a chocolate bar etc. I’ll admit that not having a password at all is a little bit silly, but you start wondering if it’s all that wrong when the application itself is usually less secure.
Robert Graham did a statistical analysis on the password database the hacker salvaged and not to anyone’s surprise, many of them were based on dictionary words. But what does this really mean? That these accounts were on the verge of being exposed anyway? It all sounds very doubtful.
This whole paradigm of picking non-dictionary passwords is as old as myself. Back in the early days, people used password crackers like John the Ripper with a dictionary of common passwords to figure out the contents of /etc/shadow. As CPUs became faster, the dictionaries became bigger and soon enough people had to start using random passwords like “80Br17SX” to be really safe.
But then rainbow tables came along and now even random passwords aren’t safe anymore. So we’re left with trying to remember all our passwords (and being reprimanded by websites if we don’t) or having to store them in a password manager, which in turn we have to secure with a master password in order to prevent us from being hacked.
So if passwords are not going to cut it anymore, what do we have left? Biometric identification? Pfff…
I thought about ending this post by saying that people should just pick whatever password they want, but I strongly believe that using a random password does help. It’s just not as helpful as everyone likes you to think.
Pattern analysis of wireless communications (part II)
Posted on November 30, 2008 by Jorrit Kronjee
Filed Under pattern analysis, wireless | Leave a Comment
I’ve made some progress in the research I’ve been conducting, which I thought I should share. In the last post I stated that every traffic type has an almost unique shape once put in a bandwidth graph. I also made a couple of graphs to prove it. I would now like to explain a little bit further what we can do with this information.
A bandwidth graph is basically made out of two components, time and speed. Speed is made out of two components as well, namely size and again, time. Size can be any number of packets. If you say for instance something has a speed of 100 kbps at a given moment in time, that speed could be made out of packets of 1500 bytes in size with an average speed of 8.33 packets/second, but it could just as well be packets of 100 bytes in size with a speed of 125 packets/second. That is, however, not very logical, because the more packets are sent, the more packet overhead there will be, the less efficient a protocol becomes. So, a protocol will try to be as efficient as possible without hurting its functionality.
To put it in other words, depending on its needs, a protocol favors certain packet sizes over others. VoIP, for instance, uses small packet sizes, because it’s a continuous stream of small data (like speaking is a continuous stream of sound), while HTTP or e-mail use big packet sizes, because they generate bursts instead of streams.
I’ve been playing a little bit with the data I captured last time to see how this statement holds up, when we focus on just the packet size instead of bandwidth usage (graphs are made with gnuplot).
So here we have a graph of all the packet sizes we’ve encountered during the HTTP session of last time. Keep in mind that the Y-axis has a log scale. The reason why I used this, is because I’m also interested in packet sizes that only appear once. And, as you can see, there are quite a few of those. Most of them seem to cluster around a 1000 bytes. So, would it make sense to produce a histogram out of a clustered version of the data? I would say so. Not just to visualize better what’s going on, but also, when we’re going to apply the things we learn now to encrypted traffic, we have to take clustering into account as well. After all, AES clusters data by encrypting blocks 128 bits at a time, padding data if necessary. So packets encrypted by AES come in steps of 128 bits.
After clustering the data in steps of 100 bytes and removing the log scale, the histogram looks kind of like this:
(Keep in mind that these graphs show traffic captured on an Ethernet network. As such, packets have a 14 bytes Ethernet header added to them)
You can see that HTTP clearly favors big packet sizes from the server side, while the client side seems to be just ACK’ing the data that it’s getting. The cluster we saw earlier is still there (at 901 – 1000) and seems to be mostly coming from the client. Looking through the capture file I noticed that the GET requests are almost a 1000 bytes in size. Since most webpages cause multiple GET requests these days, this would explain the cluster there.
To further prove my point, let me show you the graphs for the Skype traffic I captured the last time:
Wow, that’s quite colorful. You can see that Skype sends packets that are almost the same size, but not quite. Displaying this as a histogram will certainly be helpful.
That is better. You can see now that Skype generates a lot of small packets and because both sides were talking, the amount of packets is almost symmetrical.
So, now my point is proven, where do we go from here? We’ve shown that packet sizes correlate to protocol characteristics just as much as bandwidth graphs do. With this in mind, we should be able to guess the protocol, just by looking at its packet sizes. And Charles V. Wright did just that, according to this paper.
I’ve been reading through the paper a couple of times and from what I understand now is that he uses Hidden Markov Models to show the relationship between packet sizes and protocols. Using HMM he can accurately estimate the protocol. I’m not going to try to explain it myself right now, but I have already made some progress in trying to make a working application of this idea. Of course, I will share this with you as soon as I have it running.
Pattern analysis of wireless communications
Posted on November 9, 2008 by Jorrit Kronjee
Filed Under pattern analysis, wireless | 2 Comments
Although it’s not much, I thought I should post an update on what I’m working on. At the moment, I am working on pattern analysis of wireless communications. Or more specifically, encrypted wireless communications.
Every traffic type (f.e. HTTP, e-mail, IM or VoIP) has certain characteristics in bandwidth usage. For instance, HTTP is characterized by short bursts of traffic, usually consisting of large data packets (close to or at the MTU size), while VoIP is more of a stream of small packets (so lots of packets, but a small payload).
Because every traffic type behaves a little bit differently, one could guess what kind of traffic is going over the wire, just by looking at the bandwidth graphs.
The blue line represents outgoing data (request for the webpage and a few TCP ACKs), the red line represents the incoming data. The y-axis is in Bps. You can clearly see the short bursts in traffic, as I move from one article to another.
The graph for Skype is a little bit different. As you can see there’s a constant stream of data, until I hang up (~55s). Interesting to see is that when the Skype Test Service asks me to speak, you see a sudden drop in traffic (red line, ~12s). I suspect Skype uses something like VAD (Voice Activity Detection) to keep bandwidth usage low.
What I wanted to know however, how this traffic over a wired connection compares to traffic over a wireless encrypted connection. To test this, I needed a different setup.
In cryptography Alice, Bob and Eve the Eavesdropper are often used in examples to clarify data transmission. My Alice and Bob are my desktop computer and my wireless router, a Linksys WRT54G. Because my desktop computer doesn’t have a wireless card, I decided to use my HTC TyTN II as a mediation device. This shouldn’t interfere with the test.
Eve is my laptop with an Intel 3945ABG wireless chipset, using the ipwraw module for raw capture. For the actual capture, I used airodump-ng.
My Linksys WRT54G is setup to use WPA-PSK AES. To smoothen out the graphs a little bit I removed the beacon frames from the captures, as they are of no relevance for the test.
First, I started browsing news articles again:
You can clearly see the same kind of bursts as before. Based on this graph, it’s hard to say if WiFi generates any overhead (although I know it should). The next graph should make this more clear:
This graph also follows the same pattern as before. You can recognize the moment when the Skype Test Service is silent (~20s). The graph also shows that more bytes are being transmitted, which is mainly caused by the usage of WiFi.
The thing that makes me curious though is why the graph is a bit more jagged than the one before. Is this caused by the ipwraw module (which has been acting kind of flaky) or is this just the nature of WiFi? It’s going to require more extensive testing to conclude anything from it.
What I’m finally hoping to achieve is a scenario where, based on a packet capture, I can conclude what kind of traffic was most likely transmitted. There has been research on this in the past, but I’ve found nothing specifically targeting wireless connections.
So far it does look promising though. Hopefully, I will find the time next week to look into a couple of things like finding a better plotter (the graphs above were made with Wireshark) and gaining more indepth knowledge of wireless protocols.
keep looking »













