java.lang.Object
  com.cleancode.net.WebPageInspector
public class WebPageInspector
An interactive web page inspector that separates and identifies the components of a web page, including extracting the text of the page and a limited browser view of the page. This is a tool to analyze the bits and pieces (some might say flotsam and jetsam) comprising a web page, from details about the connection and the URL to the representation of the page as plain text and rendered as HTML. For those familiar with the 7-layer model of networking, I like to think of this as the 8-layer model of a web page. You specify a URL, which is then fetched from the web. The connection and the page are analyzed into these components, each available as a separate tab in the program: connection, URL, header, document info, cookies, HTML text, plain text, and rendered web page content. (Example illustrations of each tab are available here.)
WebPageInspector is a GUI application constructed with standard Swing components, enhanced by my own com.cleancode.swing.* modules. A JComboBox is the main recipient of user input other than menu commands and buttons. One enters a URL in the box. That URL is then added to the list of choices in that JComboBox for quick repeat selection. Opening the pull-down menu of the JComboBox and selecting an earlier entry also moves that entry to the top. Hence, the pull-down serves as an ordered history list, most recent first. If you wish to load a web page on your local drive, use the open file command in the File menu. The file will also be entered into the history list in the pull-down for subsequent use.
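The most-recent-first behavior of the history list can be sketched with a plain Swing combo box model. This is only an illustration of the idea, not the code WebPageInspector uses; the class name UrlHistoryModel and its visit method are hypothetical.

```java
import javax.swing.DefaultComboBoxModel;

/** Hypothetical helper (not part of WebPageInspector): a most-recent-first URL history. */
final class UrlHistoryModel {
    private final DefaultComboBoxModel<String> model = new DefaultComboBoxModel<>();

    /** Moves the URL to the top of the list, adding it first if it is new. */
    void visit(String url) {
        model.removeElement(url);      // does nothing when the URL is not yet listed
        model.insertElementAt(url, 0); // most recent entry goes first
        model.setSelectedItem(url);
    }

    /** The model that a JComboBox could be constructed with. */
    DefaultComboBoxModel<String> model() {
        return model;
    }

    public static void main(String[] args) {
        UrlHistoryModel history = new UrlHistoryModel();
        history.visit("http://example.com/a");
        history.visit("http://example.com/b");
        history.visit("http://example.com/a"); // re-selecting moves it back to the top
        for (int i = 0; i < history.model().getSize(); i++) {
            System.out.println(history.model().getElementAt(i));
        }
    }
}
```

Removing before inserting keeps each URL in the list only once, so the pull-down never accumulates duplicates.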
Entering a new URL via the keyboard, selecting one via the pull-down box, or just executing the refresh command on the current URL will initiate a fetch and analysis cycle. The progress of the cycle is displayed, along with an elapsed time, in the status bar, just below the JComboBox. The fetch is the same as if a URL were entered in a browser, i.e. the URL is sent over the World Wide Web and the corresponding web page is returned. The analysis involves taking slices of the web page data and its meta-data to populate each of the display component tables, corresponding to the tabs in the UI.
WebPageInspector uses the standard CleanCode diagnostics facility for logging. This allows you to trace the behavior of the program to any level of detail you desire. See the Diagnostic module for further details. Since WebPageInspector is a GUI application, directing diagnostic output to a log file is most appropriate. You specify where to put this log file in the configuration file.
Information about the server connection is presented in the first group of tabs. The Connection tab provides details available from the network transaction itself, including the full URL as known by the server. The URL is then broken down into its components on the URL tab, including the protocol (http, https, etc.), the port, the query (the portion following the question mark), and others. The Header tab displays information from the HTTP header of the transaction, i.e. meta-information sent with the web page but not on the web page proper. It includes, for example, the server response (200 OK vs. 404 Not Found, etc.), the type of the content (text, html, pdf, etc.), the length of the web page, the received cookies, the date of the transaction, the server version, etc. The cookies are further broken down on the Cookie tab, described later.
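The kind of breakdown shown on the URL tab can be approximated with the standard java.net.URI class. The sketch below is an assumption about how such a split might be done, not the program's own code; the class name UrlParts and the sample URL are made up for illustration.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of WebPageInspector): splits a URL into named parts. */
final class UrlParts {
    static Map<String, String> split(String url) {
        URI uri = URI.create(url);
        Map<String, String> parts = new LinkedHashMap<>(); // keeps insertion order for display
        parts.put("protocol", uri.getScheme());
        parts.put("host", uri.getHost());
        parts.put("port", String.valueOf(uri.getPort())); // -1 when the URL names no port
        parts.put("path", uri.getPath());
        parts.put("query", uri.getQuery());               // the portion after the question mark
        parts.put("fragment", uri.getFragment());
        return parts;
    }

    public static void main(String[] args) {
        String sample = "https://www.example.com:8080/search/index.html?q=cookies&page=2#top";
        for (Map.Entry<String, String> e : split(sample).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```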
The Doc Info tab contains details extracted from the web page proper, including the DTD, the title (from the <title> element), meta data (i.e. specific data from the <meta> elements, as opposed to the HTTP header meta-information), included files (stylesheets, images, JavaScript, etc.), as well as all links on the page.
The Cookie tab separates out the components of each cookie sent by the server (the received cookies), including the name, value, domain, path, expiration, and security setting. The related Xmit Cookies tab handles transmitted cookies, i.e. cookies that you send back to the server. You may choose to send cookies or not with your URL. In the Cookie menu, you'll observe the three possibilities: no cookies, entered cookies, or stored cookies (entered and stored cookies refer to the tables on the Xmit Cookies tab). The stored cookies table provides a permanent repository, whereas the entered cookies table serves as a scratch pad. You may (via the Cookie menu) use either of these when you fetch a URL. If you wish, for example, to simply echo the same set of cookies you received when you fetch the URL, first copy the received cookies to the entered cookies table (via menu or via button on the Xmit Cookies tab). Then set the cookie mode to entered cookies on the Cookie menu.
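To make the cookie components concrete, here is a minimal sketch that separates a Set-Cookie header into the same fields using the standard java.net.HttpCookie class. It is not taken from WebPageInspector; the class name CookieParts and the sample header are assumptions for illustration.

```java
import java.net.HttpCookie;
import java.util.List;

/** Hypothetical helper (not part of WebPageInspector): shows the parts of a received cookie. */
final class CookieParts {
    /** Parses one Set-Cookie header line and returns the first cookie it contains. */
    static HttpCookie first(String header) {
        List<HttpCookie> cookies = HttpCookie.parse(header);
        return cookies.get(0);
    }

    public static void main(String[] args) {
        HttpCookie c = first("Set-Cookie: session=abc123; Domain=example.com; Path=/app; Secure");
        System.out.println("name   = " + c.getName());
        System.out.println("value  = " + c.getValue());
        System.out.println("domain = " + c.getDomain());
        System.out.println("path   = " + c.getPath());
        System.out.println("maxAge = " + c.getMaxAge()); // expiration, in seconds; negative if unset
        System.out.println("secure = " + c.getSecure());
    }
}
```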
You may add, delete, or edit the cookies in the entered cookie table. Since the received cookie table (in the Cookie tab) is overwritten with each fetch of a URL, the stored cookie table is provided as a storage area to, for example, keep a copy of a cookie set before you start editing it. Note that the stored cookie table is not directly editable. To edit stored cookies, you must transfer the cookie set to the entered table, edit, then transfer them back to the stored cookie table.
The stored cookie table is also used for persistence across invocations. That is, whatever cookies you have moved to the stored cookie table for the current URL will be available the next time you run WebPageInspector. When you enter a URL in the input field (either typing or selecting from past history in the pull-down), along with fetching the URL and getting new received cookies, the stored cookie table will be reloaded with the last cookies you specifically stored there. This allows you to, for example, compare cookies between one fetch and another, even if it was something you saved last month. If you wish to scan through the cookies that you have stored for various URLs in your history, switch to offline mode, then use the pull-down in the input field to switch to each URL (or Ctrl-N and Ctrl-P to scroll through them). Offline mode brings the selected URL to the top and loads the stored cookies (if any), but does not fetch the contents of the URL from the net.
The next group of tabs displays the contents of the web page itself, in various formats. The HTML tab displays the raw HTML, the source code of the web page. The Text and Content tabs, on the other hand, display variations of the rendering of the web page.
The Text tab gives a text-only representation of the web page, i.e. stripping out all HTML encoding, and adding minimal formatting that a text-only display could support (i.e. tabs, spaces, and returns).
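One way to produce a text-only view like this is to run the page through the HTML parser that ships with Swing and keep only the text callbacks. The sketch below is an assumption about the general technique, not the program's actual implementation; the class name HtmlToText is hypothetical, and it inserts only line breaks rather than the tabs and spacing described above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

/** Hypothetical helper (not part of WebPageInspector): reduces HTML to plain text. */
final class HtmlToText {
    static String strip(String html) {
        final StringBuilder out = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                out.append(data); // character data between tags
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag.breaksFlow()) {
                    out.append('\n'); // end of a paragraph, heading, list item, etc.
                }
            }

            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.BR) {
                    out.append('\n'); // explicit line break
                }
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("<html><body><h1>Hello</h1><p>Plain <b>text</b> here.</p></body></html>"));
    }
}
```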
The Content tab is a mini-browser using Java's native support for HTML display. Unfortunately, this is not terribly robust. The web pages I have tried just don't display with very clean formatting. I have, at least, added hyperlink support on the page with a chunk of code I found from Sun, so links on the page will work, and will add to the JComboBox and history list, just as if you had typed in the URL.
There are several files WebPageInspector users should be aware of. First, the configuration file CONFIG_FILE must be stored in the current directory (the directory from which you execute the program). This configuration file allows you to specify diagnostic settings as well as the location of the other files used by WebPageInspector. The history file, used to store URLs and cookies that you visit for later recall, is by default stored in StorageMgr.DEFAULT_WPI_HISTORY_FILE, but you may specify a different location by setting the StorageMgr.HISTORY_FILE_PARAM parameter. Note that the history file is automatically created/saved when you exit the program, but you may force a save at any time via the Save History command on the File menu. Finally, if you enable diagnostics to output to a log file, the log directory is specified via the LOG_DIR parameter in the configuration file. The default is Diagnostic.DEFAULT_LOG_DIR.
Field Summary | | |
---|---|---|
static String | CONFIG_FILE | WebPageInspector configuration file name. |
String | VERSION | Current version of this class. |

Constructor Summary | |
---|---|
WebPageInspector(JFrame frame) | Creates an instance of a WebPageInspector. |
Method Summary | | |
---|---|---|
void | doAbout(ActionEvent event) | Command: ABOUT -- describe program. |
void | doAddCookie(ActionEvent event) | Command: ADD-COOKIE -- add row for cookie. |
void | doClear(ActionEvent event) | Command: CLEAR -- Erase all URL history. |
void | doClearCookies(ActionEvent event) | Command: CLEAR-COOKIES -- erase entered cookie table. |
void | doCopyRecvCookies(ActionEvent event) | Command: COPY-RECV-COOKIES -- copy received cookies. |
void | doCopyStoredCookies(ActionEvent event) | Command: COPY-STORED-COOKIES -- copy stored cookies. |
void | doCopyTab(ActionEvent event) | Command: COPY-TAB -- copy tab contents to clipboard. |
void | doDelCookie(ActionEvent event) | Command: DELETE-COOKIE -- delete row for cookie. |
void | doDeleteUrl(ActionEvent event) | Command: DELETE-URL -- delete current URL from JComboBox. |
void | doExit(ActionEvent event) | Command: EXIT -- exit. |
void | doOpenFile(ActionEvent event) | Command: OPEN-FILE -- open local file. |
void | doPasteCookie(ActionEvent event) | Command: PASTE-COOKIE -- paste cookie on clipboard into 'entered' table. |
void | doRefresh(ActionEvent event) | Command: REFRESH -- fetch current URL again. |
void | doSaveHistory(ActionEvent event) | Command: SAVE-HISTORY -- save history to a file. |
void | doSaveHtml(ActionEvent event) | Command: SAVE-HTML-FILE -- save string to a file. |
void | doSaveText(ActionEvent event) | Command: SAVE-TEXT-FILE -- save string to a file. |
void | doSetEnteredCookieMode(ActionEvent event) | Command: SET COOKIE MODE -- entered. |
void | doSetNoCookieMode(ActionEvent event) | Command: SET COOKIE MODE -- none. |
void | doSetOfflineStatus(ActionEvent event) | Command: OFFLINE -- toggle offline state. |
void | doSetStoredCookieMode(ActionEvent event) | Command: SET COOKIE MODE -- stored. |
void | doShowNextTab(ActionEvent event) | Command: NEXT-TAB -- show next tab. |
void | doShowNextUrl(ActionEvent event) | Command: NEXT-URL -- show next URL in JComboBox. |
void | doShowPrevTab(ActionEvent event) | Command: PREV-TAB -- show previous tab. |
void | doShowPrevUrl(ActionEvent event) | Command: PREV-URL -- show previous URL in JComboBox. |
void | doStoreCookies(ActionEvent event) | Command: STORE-COOKIES -- store cookies. |
static void | main(String[] args) | Main routine to operate WebPageInspector as a standalone GUI application. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final String CONFIG_FILE
public final String VERSION

Constructor Detail |
---|
public WebPageInspector(JFrame frame)
frame - JFrame object in which to build GUI.

Method Detail |
---|
public void doStoreCookies(ActionEvent event)
event - actionable event

public void doCopyStoredCookies(ActionEvent event)
event - actionable event

public void doCopyRecvCookies(ActionEvent event)
event - actionable event

public void doDelCookie(ActionEvent event)
event - actionable event

public void doAddCookie(ActionEvent event)
event - actionable event

public void doClearCookies(ActionEvent event)
event - actionable event

public void doSaveHistory(ActionEvent event)
event - actionable event

public void doOpenFile(ActionEvent event)
event - actionable event

public void doSaveHtml(ActionEvent event)
event - actionable event

public void doSaveText(ActionEvent event)
event - actionable event

public void doRefresh(ActionEvent event)
event - actionable event

public void doShowNextTab(ActionEvent event)
event - actionable event

public void doShowPrevTab(ActionEvent event)
event - actionable event

public void doShowNextUrl(ActionEvent event)
event - actionable event

public void doShowPrevUrl(ActionEvent event)
event - actionable event

public void doSetOfflineStatus(ActionEvent event)
event - actionable event

public void doSetNoCookieMode(ActionEvent event)
event - actionable event

public void doSetEnteredCookieMode(ActionEvent event)
event - actionable event

public void doSetStoredCookieMode(ActionEvent event)
event - actionable event

public void doClear(ActionEvent event)
event - actionable event

public void doCopyTab(ActionEvent event)
event - actionable event

public void doPasteCookie(ActionEvent event)
event - actionable event

public void doDeleteUrl(ActionEvent event)
event - actionable event

public void doExit(ActionEvent event)
event - actionable event

public void doAbout(ActionEvent event)
event - actionable event

public static void main(String[] args) throws IOException
args - command-line settings to override configuration file.
IOException - if default configuration file cannot be read
CleanCode Java Libraries | Copyright © 2001-2012 Michael Sorens - Revised 2012.12.10 |