Walking websites with httpscreenshot

Overview:

As part of the reconnaissance stage of a pentest, you may wish to capture the home pages of an organization's websites. One option for doing just that is HTTPscreenshot, which has been touted as a tool for both red and blue teams. It was developed by Justin Kennedy and Steve Breen and released at ShmooCon 2015.

Setup:

This guide was written using a Debian 7.8 virtual machine. A Debian- or Ubuntu-based operating system is recommended because the install script relies on apt-get commands.

    1. Install git if you haven’t done so already.
    2. Download the source code from GitHub:
    3. Install the dependencies using the included shell script.
    4. Note: the swig3.0 package could not be found at the time of this writing. I installed Swig manually with apt-get and removed swig and swig3.0 from the install script.
    5. Create a flat file using vi or nano with a list of websites you would like to have scraped.
    6. Each scraped website produces a PNG screenshot and an HTML file. The HTML files can be searched with grep for specific content.
    7. My first attempt was to scrape Google.
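
The flat file from step 5 can also be generated rather than typed by hand. The following is a minimal sketch, assuming the list holds one URL per line (check the tool's usage output for the exact format it expects). The function names and the default `http` scheme are illustrative choices, not part of HTTPscreenshot.

```python
from urllib.parse import urlsplit


def normalize_target(line, default_scheme="http"):
    """Return a URL with a scheme, or None for blank, comment, or host-less lines."""
    entry = line.strip()
    if not entry or entry.startswith("#"):
        return None
    if "://" not in entry:
        # Assumption: bare hostnames should be tried over plain HTTP.
        entry = f"{default_scheme}://{entry}"
    if not urlsplit(entry).netloc:
        return None
    return entry


def build_target_list(raw_lines):
    """Normalize entries and drop duplicates while keeping first-seen order."""
    seen = set()
    targets = []
    for line in raw_lines:
        url = normalize_target(line)
        if url and url not in seen:
            seen.add(url)
            targets.append(url)
    return targets


def write_target_file(path, raw_lines):
    """Write one normalized URL per line to path and return the list written."""
    targets = build_target_list(raw_lines)
    with open(path, "w", encoding="utf-8") as handle:
        handle.writelines(f"{url}\n" for url in targets)
    return targets
```

For example, `write_target_file("targets.txt", ["google.com", "https://example.org"])` would produce a two-line file that can then be handed to the scraper.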

[Screenshot: HTTPscreenshot scrape of Google, captured 2016-05-21]
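
Once a run has finished, the saved HTML files can be searched for content of interest, much as you would with grep. Here is a small sketch of that idea in Python. It assumes the `.html` files sit directly in one output directory; the function name and the case-insensitive matching are my own choices for illustration.

```python
from pathlib import Path


def search_saved_pages(output_dir, needle):
    """Yield (file name, line number, line) for each case-insensitive match."""
    wanted = needle.lower()
    # Only the saved .html files are read; screenshots (.png) are skipped.
    for page in sorted(Path(output_dir).glob("*.html")):
        text = page.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if wanted in line.lower():
                yield page.name, number, line.strip()


if __name__ == "__main__":
    # Hypothetical usage: look for login pages in the current directory.
    for name, number, line in search_saved_pages(".", "login"):
        print(f"{name}:{number}: {line}")
```

Searching for terms such as "login" or "admin" is a quick way to triage a large batch of captured pages before opening the screenshots.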

A full usage list is also provided for reference here:

Bonus Content:

I tried my hand at getting HTTPscreenshot to run on CentOS version 7. It took a bit of trial and error, but I am satisfied with the result. The following steps assume a base install of CentOS 7 64-bit.

  1. Install git and download the source script.
  2. The core script is written in Python, so we need to install the required Python libraries. Install epel-release and perform a repo refresh.
  3. Also install the development headers for various libraries, along with a compiler.
  4. Next, install pip, a package manager for Python, as well as Swig and OpenSSL (required for M2Crypto).
  5. Then install the Python packages that HTTPscreenshot depends on.
  6. Lastly, download and extract PhantomJS.
  7. Now it's time to scrape some websites. I will default to the old standby of Google.
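
Because the CentOS install took some trial and error, a quick pre-flight check can save time before the first scrape. The sketch below reports which command-line tools are missing from PATH. The executable names in `REQUIRED_TOOLS` are assumptions based on the steps above and may differ on your system. A PhantomJS binary that was only extracted, and not placed on PATH, will show up as missing.

```python
import shutil

# Assumed executable names; adjust them to match your system.
REQUIRED_TOOLS = ("git", "python", "pip", "swig", "phantomjs")


def find_missing_tools(tools=REQUIRED_TOOLS, locate=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if locate(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

The `locate` parameter exists only so the check can be exercised without touching the real PATH. In normal use the default, `shutil.which`, is all you need.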

[Screenshot: HTTPscreenshot scrape of Google on CentOS 7, captured 2016-05-21]

Conclusion:

HTTPscreenshot is a powerful tool for automating information gathering. I recommend following both Justin and Steve on Twitter.
