
WebPageInspector

N U T S H E L L
Product	WebPageInspector
Description	Dissects a web page into components
Language	Java
Interface	GUI
Platform	any Java-supported (Windows,Linux,...)
Reference	API

Web applications are strange creatures. Only when you begin to realize that you are not truly a master of all the intricacies of the myriad components that make up a web application are you on the path to enlightenment. If you are a master of HTML, that's fine... until you need to interact with a server. If you eat the nuances of JavaServer Pages for breakfast, that's great... until you have load balancing issues to unravel. If you can make whiz-bang client-side scripts that make buttons on your web page dance, wonderful... until your cookies are misbehaving. HTML, XHTML, DHTML, XML, XSL, XPath, JScript, JavaScript, Java, JAXP, CGI, JSP, ASP, SQL, HTTP, CSS, URL, URI, W3C, ActiveX, JRE, ... these are just some of the interacting technologies that one must understand to get a handle on how quirky web pages can be.

So along comes WebPageInspector. Do not be under any illusions--this is not a panacea for anything. Not even close. What it is, however, is one tool to add to your arsenal that can assist in monitoring, diagnosis, and ultimately understanding of a WWW transaction.

The thumbnails here illustrate the views into a web page that WebPageInspector offers.

WebPageInspector fetches a URL and breaks the resulting page down into components, each available on a separate tab in the interface. Most of the tabs are shown on this page for a popular search engine, www.alltheweb.com. Several of the tabs provide information about the web page rather than its content, including the server connection, the parts of the URL itself, the HTTP headers, the cookies, and so forth.
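To make "the parts of the URL itself" concrete, here is a minimal sketch of that kind of breakdown using the standard java.net.URI class. How WebPageInspector actually does this is not shown on this page; the class name UrlParts and its split method are made up for illustration.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of WebPageInspector's documented API.
class UrlParts {
    /** Splits a URL string into named components using java.net.URI. */
    static Map<String, String> split(String url) {
        URI uri = URI.create(url);
        Map<String, String> parts = new LinkedHashMap<>(); // keeps display order
        parts.put("scheme", uri.getScheme());
        parts.put("host", uri.getHost());
        // getPort() returns -1 when the URL carries no explicit port.
        parts.put("port", String.valueOf(uri.getPort()));
        parts.put("path", uri.getPath());
        parts.put("query", uri.getQuery());       // null if absent
        parts.put("fragment", uri.getFragment()); // null if absent
        return parts;
    }

    public static void main(String[] args) {
        split("http://www.alltheweb.com/search?q=java#top")
            .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

A tabbed display could show each map entry as one row; absent components simply come back as null (or -1 for the port).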

The DocInfo tab, shown at right, finally moves on to the web page proper, detailing things like the DTD, the title (from the <title> element), metadata (i.e., specific data from the <meta> elements, as opposed to the HTTP header meta-information), included files (stylesheets, images, JavaScript, etc.), and all links on the page.
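As a rough illustration of the kind of extraction the DocInfo tab implies, the sketch below pulls the title and the link targets out of an HTML string with regular expressions. This is an assumption-laden toy, not WebPageInspector's own method: the class DocInfoSketch is hypothetical, and regular expressions will miss unquoted attributes and other HTML quirks that a real parser handles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; a real tool would more likely use an HTML parser.
class DocInfoSketch {
    // Text between <title> and </title>, case-insensitive, may span lines.
    private static final Pattern TITLE = Pattern.compile(
        "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Quoted href value inside an <a ...> tag.
    private static final Pattern HREF = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    /** Returns the trimmed page title, or null if no <title> element is found. */
    static String title(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    /** Returns the href values of all <a> tags, in document order. */
    static List<String> links(String html) {
        List<String> out = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }
}
```

The same pattern-per-item approach would extend to <meta> elements or included files, though at that point a proper parser is the safer choice.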

The Cookie tab, shown at left, shows cookies received from the server. Unlike in a conventional web browser, however, you may manipulate the cookies: edit the ones present, or add or delete any. Just like a regular browser, WebPageInspector keeps a history list of URLs visited that persists across program invocations. It also saves one set of these editable cookies for each URL in your history list.
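The idea of one editable cookie set per URL can be sketched with the standard java.net.HttpCookie class. Everything here beyond HttpCookie itself is assumed for illustration: the class CookieJarSketch and its method names are hypothetical, and persistence across program runs is left out.

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model: one editable cookie list per visited URL.
class CookieJarSketch {
    private final Map<String, List<HttpCookie>> cookiesByUrl = new LinkedHashMap<>();

    /** Parses a Set-Cookie header value and stores the result under the URL. */
    void receive(String url, String setCookieHeader) {
        cookiesFor(url).addAll(HttpCookie.parse(setCookieHeader));
    }

    /** Changes the value of a named cookie; returns false if it is not present. */
    boolean edit(String url, String name, String newValue) {
        for (HttpCookie c : cookiesFor(url)) {
            if (c.getName().equals(name)) {
                c.setValue(newValue);
                return true;
            }
        }
        return false;
    }

    /** Removes a named cookie; returns true if anything was removed. */
    boolean delete(String url, String name) {
        return cookiesFor(url).removeIf(c -> c.getName().equals(name));
    }

    /** Returns the (mutable) cookie list for a URL, creating it if needed. */
    List<HttpCookie> cookiesFor(String url) {
        return cookiesByUrl.computeIfAbsent(url, k -> new ArrayList<>());
    }

    /** Builds the Cookie request header value to send back to the server. */
    String requestHeader(String url) {
        StringBuilder sb = new StringBuilder();
        for (HttpCookie c : cookiesFor(url)) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(c.getName()).append('=').append(c.getValue());
        }
        return sb.toString();
    }
}
```

Editing a cookie and then re-requesting the page with the modified header is the sort of experiment a conventional browser does not make easy.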

The final group of tabs displays the contents of the web page itself, in various formats. The HTML tab displays the raw HTML, the source code of the web page. The Text and Content tabs, on the other hand, display variations of the rendering of the web page. The former, shown at left, strips out all HTML elements but adds the minimal formatting that a text-only display can support (i.e., tabs, spaces, and returns), leaving the plain text of the page in a (sometimes) reasonable layout.
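One simple way to approximate what the Text tab describes is to turn a few structural tags into returns and tabs and then drop every remaining tag. The sketch below assumes that approach; the class TextViewSketch is hypothetical, the choice of which tags become which whitespace is a guess, and character entities such as &amp; are left undecoded.

```java
// Hypothetical sketch of an HTML-to-plain-text pass; not WebPageInspector's code.
class TextViewSketch {
    static String toPlainText(String html) {
        String s = html;
        // Drop script and style elements together with their contents.
        s = s.replaceAll("(?is)<(script|style)\\b.*?</\\1\\s*>", "");
        // Line breaks and the ends of block-level elements become returns.
        s = s.replaceAll("(?i)<br\\s*/?>", "\n");
        s = s.replaceAll("(?i)</(p|div|h[1-6]|li|tr)\\s*>", "\n");
        // The end of a table cell becomes a tab.
        s = s.replaceAll("(?i)</t[dh]\\s*>", "\t");
        // Remove every remaining tag.
        s = s.replaceAll("<[^>]+>", "");
        // Collapse runs of spaces and strip leading/trailing whitespace.
        s = s.replaceAll(" {2,}", " ");
        return s.trim();
    }

    public static void main(String[] args) {
        System.out.println(toPlainText("<h1>Title</h1><p>Hello <b>world</b><br>bye</p>"));
    }
}
```

The "(sometimes)" in the description applies here too: pages that rely on CSS or nested tables for layout will not come out looking tidy.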

Below are examples of the HTML and Content tabs, showing the source and the HTML rendering, respectively.


CleanCode -- The Website for Clean Design
Revised 2013.06.30