

CHAPTER 2 RELATED WORK

2.1 Web Automation Related Work

2.1.2 Web Automation Creation Tools

2.1.2.1 WIDL[10][11]

Many business systems are available that transform the Web browser from an occasionally informative accessory into an essential business tool. Business units that were previously unable to agree on middleware and data interchange standards for direct communication now agree on communicating through HTTP and HTML, which requires human intervention (Figure 2-1). This manual operation becomes highly inefficient when large amounts of transcription or copy-and-paste work are part of the daily job. The goal of the Web Interface Definition Language (WIDL)[10][11] is to enable automation of interactions with HTML/XML documents and forms, allowing the Web to be used as a universal integration platform without these efficiency problems.

Figure 2-1: The need for Web automation

WIDL uses the XML standard to define interfaces and services, mapping existing Web content into program variables and allowing the resources of the Web to be made available for integration with business systems. It brings to the Web the IDL concepts already implemented in standards such as CORBA for distributed computing. WIDL describes and automates interactions with services hosted by Web servers on intranets, extranets, and the Internet; it provides a standard integration platform and a universal API for all Web-enabled systems.

A service defined by WIDL is equivalent to a function call in a standard programming language. What WIDL defines is how to "make a call" to a Web service: it specifies the location (URL) of the service, the input parameters to be submitted, and the output parameters to be returned. Note that, as with other IDLs, a standard programming language is still needed for further processing of the data, but a browser is no longer required.
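To make the idea of "a Web interaction as a function call" concrete, the following minimal Python sketch wraps a hypothetical supplier price-lookup page as an ordinary function: the URL, the submitted input parameters, and the extraction of output values from the returned HTML play the roles of WIDL's service location, input bindings, and output bindings. The URL and the markup patterns are invented for illustration; this is not actual WIDL syntax.

# A minimal sketch (not WIDL syntax) of a Web interaction exposed as a
# function call: a URL, named inputs, and named outputs extracted from
# the returned document. URL and extraction patterns are hypothetical.
import re
import urllib.parse
import urllib.request

def check_price(part_number: str) -> dict:
    """'Call' a supplier's Web-based price lookup as if it were a function."""
    # Input binding: the service location and the form field to submit.
    url = "https://supplier.example.com/price-lookup"            # hypothetical
    query = urllib.parse.urlencode({"part": part_number})
    with urllib.request.urlopen(f"{url}?{query}") as response:   # GET request
        html = response.read().decode("utf-8", errors="replace")

    # Output binding: map regions of the returned HTML to program variables.
    price = re.search(r'id="price">([^<]+)<', html)              # hypothetical markup
    stock = re.search(r'id="availability">([^<]+)<', html)
    return {
        "price": price.group(1) if price else None,
        "availability": stock.group(1) if stock else None,
    }

# No browser is involved; the calling program consumes the result directly:
# result = check_price("A-1234")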

The use of standard Web technologies empowers various IT departments to make independent technology selections. This has the effect of lowering both the technical and political barriers that have typically threatened cross-organizational integration projects. Here is a brief overview of applications that WIDL enables:

• Manufacturers and distributors:
  - Access supplier and competitor e-commerce systems automatically to check pricing and availability
  - Load product data (specification sheets) from supplier Web sites
  - Place orders automatically (e.g., when inventory drops below predetermined levels)
  - Integrate package tracking functionality for enhanced customer service

• Human resources:
  - Automated update of new employee information in multiple internal systems
  - Automated aggregation of benefits information from healthcare and insurance providers

• Governments:
  - Kiosk systems that aggregate data and integrate services across departments or state and local offices

• Shipping and delivery services:
  - Multi-carrier package tracking and shipment ordering
  - Access to currency rates, customs regulations, etc.

Shipping companies were early leaders in bringing widely applicable functionality to the Web. Web-based package tracking services provide important logistics information to both large and small organizations. Many organizations employ people for the sole purpose of manually tracking packages to ensure customer satisfaction and to collect refunds for packages that are delivered late. Integrating package tracking functionality directly into warehouse management and customer service systems is therefore a huge benefit, boosting productivity and enabling more efficient use of resources.

Using WIDL, the Web-based package tracking services of numerous shipping companies can be described through common application interfaces and integrated with various internal systems. In almost all cases, the programmatic interfaces to different package tracking services are identical, which means that WIDL can impose consistency in the representation of functionality across systems.
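The following sketch illustrates, with hypothetical carrier names, URLs, and extraction patterns, how tracking services with identical programmatic interfaces can sit behind one common function: only the per-carrier configuration changes, never the calling code.

# A sketch of nearly identical tracking services behind one common interface:
# each carrier differs only in its URL template and the pattern used to
# extract the delivery status. All names and patterns are hypothetical.
import re
import urllib.request

CARRIERS = {
    "carrier_a": {
        "url": "https://tracking.carrier-a.example/status?tn={tn}",
        "status_pattern": r'class="status">([^<]+)<',
    },
    "carrier_b": {
        "url": "https://www.carrier-b.example/track/{tn}",
        "status_pattern": r'id="delivery-status">([^<]+)<',
    },
}

def track_package(carrier: str, tracking_number: str) -> str:
    """Same signature for every carrier: carrier name + tracking number -> status."""
    cfg = CARRIERS[carrier]
    with urllib.request.urlopen(cfg["url"].format(tn=tracking_number)) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    match = re.search(cfg["status_pattern"], html)
    return match.group(1) if match else "unknown"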

2.1.2.2 WebVCR[1]

As the Web becomes more interactive, personalized, and rich in content, increasing complexity of manipulation seems inevitable. Users are forced to go through several steps and fill out a sequence of forms before reaching the desired results. For example, consider using a travel site. The steps are: (1) go to the travel site URL; (2) choose the Find/Book a Flight option; (3) log in; (4) fill in a form with the details of the itinerary. However, a single visit to the results page is likely to be insufficient. It may take weeks or months to find an acceptable fare, and the whole process needs to be repeated every time to reach the results page. What WebVCR does is record the browsing steps and replay them later, as many times as the user needs. The saved sequence is called a smart bookmark, which differs from conventional bookmarks that can only save pages reachable in one step.

There are two different implementation architectures, client-based and server-based.

In the client-based version, WebVCR is implemented as a Java applet that runs with the user's browser. The user starts WebVCR by loading the WebVCR starting page into a browser window (the main window), which immediately opens another browser window (the applet window) to load the HTML page containing the WebVCR applet, making it persistent across record and play sessions. To record a smart bookmark, the user traverses the Web to the desired starting point and clicks on the Record button in the applet. Clicking on the Record button causes two actions to take place (both transparent to the user): (1) the applet records the current URL as the starting location of the smart bookmark; (2) the applet inserts event handlers on all elements in the main window to capture whatever the user may do. From then on, as the user navigates via link traversals or form submissions, each action triggers the inserted event handler, which causes the applet to record the corresponding action. When the user finally reaches the desired page, he clicks on the Stop button and WebVCR stops recording.
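As a rough illustration of what such a recording might contain, the sketch below represents a smart bookmark as a start URL plus a list of recorded link traversals and form submissions, and replays them in order. It uses the third-party requests library and invented travel-site URLs and field names; the real WebVCR applet records and replays inside the browser rather than through standalone HTTP calls.

# A sketch of a recorded smart bookmark (start URL plus recorded actions)
# and a replay routine that walks the steps in order. URLs and form field
# names are hypothetical.
import requests  # third-party HTTP library, assumed available

smart_bookmark = {
    "start_url": "https://travel.example.com/",
    "steps": [
        {"type": "follow_link", "url": "https://travel.example.com/flights"},
        {"type": "submit_form", "url": "https://travel.example.com/login",
         "fields": {"user": "alice", "password": "secret"}},
        {"type": "submit_form", "url": "https://travel.example.com/search",
         "fields": {"from": "TPE", "to": "SFO", "date": "2024-06-01"}},
    ],
}

def replay(bookmark: dict) -> str:
    """Re-execute the recorded steps in order and return the final page."""
    session = requests.Session()            # keeps cookies across steps
    page = session.get(bookmark["start_url"]).text
    for step in bookmark["steps"]:
        if step["type"] == "follow_link":
            page = session.get(step["url"]).text
        elif step["type"] == "submit_form":
            page = session.post(step["url"], data=step["fields"]).text
    return page  # the results page the user originally reached by hand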

In the server-based version, the implementation is considerably more complex. Many issues that a browser would normally handle must be implemented in the server. JavaScript handlers are no longer a valid option for detecting browsing actions, and cookie handling and HTTPS connections require additional work on the server side.

2.1.2.3 LiveAgent[1]

LiveAgent allows developers to create Web automation agents by recording the browsing process. While a developer records an agent, LiveAgent's agent engine, also called the AgentSoft proxy, intervenes between the browser and the Web by altering the Web pages being browsed so that user events can be monitored and recorded. The proxy monitors browsing sessions and inserts appropriate code into browsed Web pages. This involves adding and routing event handlers for whatever the user may change, such as input fields, links, and buttons. To keep data persistent across pages, a hidden frame is used alongside the main frame used for browsing.

Whenever the user records an event of the browsing process, the proxy must try to understand the user's action so that it can be replayed later. For every event, a window pops up asking the user to specify their intention, which allows the proxy to understand the action. For example, selecting a hyperlink could be understood as clicking on the fifth link on the page, or as clicking the link whose anchor text contains the word "profile".
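These two interpretations can be illustrated with a small Python sketch that locates a link either by its position on the page or by a word in its anchor text; the parsing here uses the standard library and only approximates what the proxy does on live pages.

# A sketch of the two ways a recorded "click this link" could be interpreted:
# by position (the N-th link on the page) or by anchor text (a link whose
# text contains a given word).
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (href, anchor text) pairs from an HTML document."""
    def __init__(self):
        super().__init__()
        self.links = []          # list of [href, anchor_text]
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.append([dict(attrs).get("href", ""), ""])
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False

    def handle_data(self, data):
        if self._in_link and self.links:
            self.links[-1][1] += data

def link_by_index(html: str, index: int) -> str:
    """Strategy 1: the N-th link on the page (1-based)."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links[index - 1][0]

def link_by_anchor_text(html: str, word: str) -> str:
    """Strategy 2: the first link whose anchor text contains the given word."""
    parser = LinkCollector()
    parser.feed(html)
    for href, text in parser.links:
        if word.lower() in text.lower():
            return href
    raise LookupError(f"no link containing {word!r}")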

Additional flexibility is given to the recorded process by defining parameters for the whole process and by conditional branching. For example, "If it is Sunday, follow the link to the crossword puzzle and retrieve it." If it is not Sunday, the link is simply not followed and another node is visited. There can also be loops with tasks such as "Download stories while there are still stories about the Internet", and interdependencies with conditions such as "If the Sunday puzzle was downloaded, then download the sports section too." These definitions are kept using AgentSoft's HTML Position Definition Language (HPD). Result extraction and report format definition are also part of the HPD language.
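Expressed in plain Python rather than HPD, the kind of control flow described above might look like the following sketch, where the fetch_* callables are hypothetical stand-ins for recorded browsing steps.

# A sketch of conditional branching, looping, and interdependencies in an
# agent, using plain Python instead of HPD. The fetch_* arguments are
# hypothetical placeholders for recorded browsing steps.
import datetime

def run_news_agent(fetch_puzzle, fetch_next_story, fetch_sports):
    report = {"stories": [], "puzzle": None, "sports": None}

    # Conditional branch: only follow the crossword link on Sundays.
    puzzle_downloaded = False
    if datetime.date.today().weekday() == 6:        # 6 == Sunday
        report["puzzle"] = fetch_puzzle()
        puzzle_downloaded = True

    # Loop: keep downloading while the stories are still about the Internet.
    story = fetch_next_story()
    while story is not None and "internet" in story.lower():
        report["stories"].append(story)
        story = fetch_next_story()

    # Interdependency: the sports section depends on the puzzle having been fetched.
    if puzzle_downloaded:
        report["sports"] = fetch_sports()
    return report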

The MasterAgent tool is another part of the package. It combines several created agents with LiveAgent to collect information in parallel and then merges the information into a single report. Further processing of the results requires the developer to use Java.
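A minimal sketch of that idea, with dummy callables standing in for recorded agents, could run the agents concurrently and merge their outputs into one report.

# A sketch of running several agents in parallel and merging their results
# into a single report. The agent callables are hypothetical stand-ins for
# recorded LiveAgent agents.
from concurrent.futures import ThreadPoolExecutor

def run_agents_in_parallel(agents: dict) -> dict:
    """agents maps a name to a zero-argument callable returning its result."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(agent) for name, agent in agents.items()}
        # Merge every agent's output into one combined report.
        return {name: future.result() for name, future in futures.items()}

# Example usage with dummy agents standing in for recorded browsing sessions:
# report = run_agents_in_parallel({
#     "flights": lambda: "cheapest fare found",
#     "weather": lambda: "destination forecast",
# })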

2.1.2.4 Internet Scrapbook[3]

Internet Scrapbook automates users' daily browsing tasks using a programming-by-demonstration technique. In Scrapbook, users demonstrate which portions of Web pages they are interested in by selecting them in a Web browser, and the selections are assembled into a personal page. Once the personal page is created, the system automatically updates it by extracting the user-specified portions from the newest versions of the source pages. Thus, the user can browse only the necessary information on a single page and avoid repetitive access to multiple Web pages.

Scrapbook generates a matching pattern when the user selects the desired data from the Web browser. Therefore, the pattern should contain information that is expected to remain constant even after the source page has been modified. It uses two kinds of descriptions to define matching patterns: heading pattern and tag pattern.

The heading pattern assumes that headings such as "Top News" and "Economy" are preserved in the news page while the articles following these headings keep changing. Scrapbook assumes that, when the user selects data, the line immediately before the selection, the first line of the selection, and the first line after the selection area are permanent headings. The tag pattern represents the position of the selected data in the Web page by using its HTML elements.
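A simplified sketch of the two pattern types follows, working on a page kept as plain text lines for the heading pattern and on raw HTML for the tag pattern; neither function reflects Scrapbook's actual pattern language.

# A sketch of the two kinds of matching pattern: extraction between headings
# assumed to stay constant, and extraction by position in the HTML element
# structure. Both are simplified illustrations.
import re

def extract_by_heading(lines, heading_before, heading_after):
    """Return the lines between two headings assumed to remain constant."""
    start = lines.index(heading_before) + 1
    end = lines.index(heading_after, start)
    return lines[start:end]

def extract_by_tag_path(html, tag_path):
    """Very rough tag pattern: follow the first match of each tag in turn,
    e.g. ["body", "table", "td"] keeps narrowing to that element's content."""
    fragment = html
    for tag in tag_path:
        match = re.search(rf"<{tag}[^>]*>(.*?)</{tag}>", fragment, re.S | re.I)
        if not match:
            return None
        fragment = match.group(1)
    return fragment

# Example: articles under "Top News" until the next heading "Economy".
# page_lines = ["Top News", "Article 1 ...", "Article 2 ...", "Economy"]
# extract_by_heading(page_lines, "Top News", "Economy")
#   -> ["Article 1 ...", "Article 2 ..."]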
