Dec 19, 2016

Quickly grasp what an Autonomous System is up to

With some filtering, the solution to an issue presented itself in an obvious way.

The ingredients: online databases publishing AS (Autonomous System) IP blocks; rgxg cidr, a tool to convert these blocks into greppable regexes; a bash loop to combine them; and finally goaccess, an ncurses tool that aggregates information from access logs. The last one reads from a pipe as well, so it integrated nicely into quick questions posed further up the filtering pipeline.

Copy-paste the list of IP blocks into a file:

for ipblock in $(cat list); do
    echo -n "($(rgxg cidr $ipblock))|";
done | sed 's/|$//g'

and run the result (quoted, since it contains pipes and parentheses) through zgrep into goaccess:

zgrep -h -E '<insert generated regex>' accesslogs.*.gz | goaccess -r --log-format=COMBINED --date-format='%d/%b/%Y' --time-format='%H:%M:%S' -
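If rgxg isn't at hand, the same combining trick works with hand-written prefix patterns; a minimal sketch (the file name, the patterns — rough prefixes, not exact CIDR boundaries — and the log line are all made up):

```shell
# Stand-ins for what `rgxg cidr <block>` would emit per CIDR block.
printf '%s\n' '66\.211\.1[6-9][0-9]\.' '66\.135\.2[0-9][0-9]\.' > blocklist

# Join the patterns into one alternation, dropping the trailing '|'.
regex=$(while read -r pat; do
    printf '(%s)|' "$pat"
done < blocklist | sed 's/|$//')

# Any log line whose client IP falls under one of the prefixes matches:
echo '66.211.160.87 - - [19/Dec/2016:10:00:00 +0000] "GET / HTTP/1.1" 200 0' \
    | grep -E "$regex"
```

The same joined regex then drops straight into the zgrep above.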

Lazy extra: rgxg cidr also gets you a valid IPv4 regex quickly (point it at the whole address space, 0.0.0.0/0).


Also usable to egrep the blocks out of a curl to an online service listing them; just extend the pattern to cover the block-size notation (/24).
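A minimal sketch of pulling CIDR notation out of such a page — the echoed text stands in for the curl output, using documentation addresses rather than any real AS's blocks:

```shell
# Stand-in for: curl -s <online service listing the AS blocks>
# Pull out anything that looks like IPv4 CIDR notation.
echo 'announces 192.0.2.0/24 and 198.51.100.0/24 among others' \
    | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}/[0-9]{1,2}'
# → 192.0.2.0/24
# → 198.51.100.0/24
```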

So what did it help me with? Most eBay auction tools apparently didn't follow 1-hop redirects for to-be-uploaded image URLs when listings were created programmatically. Seeing the whole eBay AS on the image URL paths with only one HTTP 200 for every three 301s, it seemed clear the links couldn't be handed over in a way that would trigger a redirect (in path or scheme). You could have seen this behaviour from only a handful of requests, but the 1:3 distribution over a few thousand made it obvious.
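That status distribution is visible without goaccess too; a sketch counting status codes straight from combined-format lines (the log lines below are invented):

```shell
# In combined log format the status code is the 9th whitespace field.
cat > sample.log <<'EOF'
66.211.160.87 - - [19/Dec/2016:10:00:01 +0000] "GET /a.jpg HTTP/1.1" 301 0 "-" "-"
66.211.160.87 - - [19/Dec/2016:10:00:02 +0000] "GET /b.jpg HTTP/1.1" 301 0 "-" "-"
66.211.160.88 - - [19/Dec/2016:10:00:03 +0000] "GET /c.jpg HTTP/1.1" 301 0 "-" "-"
66.211.160.88 - - [19/Dec/2016:10:00:04 +0000] "GET /d.jpg HTTP/1.1" 200 512 "-" "-"
EOF
awk '{count[$9]++} END {for (s in count) print s, count[s]}' sample.log | sort
# → 200 1
# → 301 3
```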

Another possible explanation is that some eBay crawlers were based on Java 1.6 at the time and couldn't negotiate the SSL connection. But as far as I remember, the host didn't have a modern TLS config.


