Tshark to the rescue

In my line of work as a sysadmin at one of the largest sites in Norway, I occasionally have to inspect HTTP traffic for some more or less urgent reason.

One of the tools I really love working with is tshark.
Tshark is the console version of Wireshark and lets you sniff and dissect just about any protocol in real time.

One of the problems I had recently was identifying web traffic originating from our web server.
Over the years our code has accumulated server-initiated fetches: stuff like file_get_contents("http://somesite/someurl") in the presentation code. This is bad since it creates external dependencies for delivering a page and keeps apache/nginx/lighttpd threads/processes busy while they wait for the remote site.

tshark -i eth0 -n -a duration:60 -z http,tree -z http_srv,tree -T fields -e http.host -e http.request.uri -e http.request.method -Y http -t ad 'src host 10.0.0.144 and (dst port 80 or dst port 443)'

This roughly says:

Listen on the eth0 interface for 60 seconds, without resolving names (-n).
Write out two different sets of statistics about the HTTP traffic.
Write out the "Host:" header, the URL and the request method (GET/POST).
Only show packets that match the display filter http (-Y on current releases, -R before Wireshark 1.10).
Write timestamps in a readable absolute format (not used here, since -T fields prints only the requested fields).
Only capture traffic from my IP to port 80 (HTTP) and port 443 (HTTPS).

This little trick helped me identify loads of external dependencies and pinpointed some ugly code that needed some care.
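
With a minute of busy traffic the field output gets long, so a small summary helps. The following is a minimal sketch, not part of the original one-liner: a hypothetical helper named top_hosts that reads the tab-separated host/URI/method lines from stdin and counts requests per host. It assumes the -z statistics are left out, since they are printed to the same output stream, and that your tshark release accepts -Y for the display filter.

```shell
# top_hosts (hypothetical helper): read tab-separated lines of
# "host<TAB>uri<TAB>method" on stdin and print "count host",
# busiest host first. Lines with an empty host field are skipped.
top_hosts() {
    awk -F '\t' '$1 != "" { count[$1]++ }
                 END { for (h in count) print count[h], h }' |
        sort -k1,1nr -k2,2
}

# Possible usage (needs capture privileges; interface and IP as in the post):
# tshark -i eth0 -n -a duration:60 -T fields \
#     -e http.host -e http.request.uri -e http.request.method \
#     -Y http.request 'src host 10.0.0.144 and dst port 80' | top_hosts
```

The hosts at the top of that list are the first external dependencies I would go looking for in the code.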

While I was at it, I figured out I could do something similar with MySQL queries. Instead of turning on full query logging in MySQL (which probably means a restart of a running production MySQL), I could just sniff it.

tshark -i eth0 -aduration:60 -d tcp.port==3306,mysql -T fields -e mysql.query  'port 3306'

Which roughly says:

Listen on eth0 for 60 seconds.
Interpret port 3306 as MySQL.
Write out the queries.
Only look at traffic on port 3306.
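
Packets on port 3306 that carry no query show up as empty lines in this output, and the same query tends to repeat a lot. As a rough sketch (top_queries is a hypothetical helper, not a tshark feature), the output can be piped through something like this to drop the blanks and rank identical queries by how often they were seen:

```shell
# top_queries (hypothetical helper): read one query per line on stdin,
# ignore blank lines, and print "count<TAB>query" for the most frequent
# queries. The optional first argument limits the number of rows (default 20).
top_queries() {
    awk 'NF { count[$0]++ }
         END { for (q in count) printf "%d\t%s\n", count[q], q }' |
        sort -t "$(printf '\t')" -k1,1nr -k2 |
        head -n "${1:-20}"
}

# Possible usage (needs capture privileges):
# tshark -i eth0 -a duration:60 -d tcp.port==3306,mysql \
#     -T fields -e mysql.query 'port 3306' | top_queries 10
```

This only groups queries that are byte-for-byte identical; queries that differ in a literal value are counted separately.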

Have fun.

Other useful fields for -T fields -e:

http.response.code  
http.server  
http.content_type  
ip.src  
ip.dst  
tcp.port  
http.user_agent
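
As one example of combining these fields, here is a sketch that pairs ip.src with http.response.code to see which remote servers answer the server-initiated fetches with errors. The helper name error_responses and the 400 threshold are my own assumptions for illustration.

```shell
# error_responses (hypothetical helper): read tab-separated lines of
# "ip.src<TAB>http.response.code" on stdin and print
# "ip<TAB>code<TAB>count" for every response with a status of 400 or above.
error_responses() {
    awk -F '\t' '$2 + 0 >= 400 { count[$1 "\t" $2]++ }
                 END { for (k in count) printf "%s\t%d\n", k, count[k] }' |
        sort
}

# Possible usage (needs capture privileges; responses arrive from port 80):
# tshark -i eth0 -n -a duration:60 -Y http.response -T fields \
#     -e ip.src -e http.response.code 'src port 80' | error_responses
```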