Basic Wrapper Technology and How To Build A Web Application

C. Frank Starmer and the IT Lab

What are the essential components of a browser (web) application?

The essential feature of a web/browser application is not the actual application, but rather managing the links between the client, server and database/data files. Writing the application itself is usually straightforward: it's linear, sequential and easily visualized. Access management (link management), i.e. managing the communication between the client and server, is difficult because the communication is asynchronous and discontinuous (the client is not continuously connected to the server). Below we identify the essential components of link management, and describe local tools we have crafted to make access management easy.

Access (Link) Management

  • Authentication: to determine if the user is valid
  • Access control: to determine if the valid user can have access to resources
  • Session management: to limit the dead time during which an app remains accessible via a secure path.

Web apps are a sequence of connect-disconnect transactions between the client and server, so some sort of session management is necessary: an app that remains open for browser access (for example, the phone rings and you don't return to the app for half an hour), or a browser that is not closed at the end of a session, should have limited "rights" to continue the session.

  • Logging (to track user and access to secure resources)

The application

  • Data flow management/processing between client and a target file/database

    For some apps, we'll choose to write the data flow management and processing ourselves, but for other applications, we may choose to use an existing application (perhaps a legacy app that is not web enabled) and simply wrap a wrapper (authentication, access control, session management, and logging) around it. For a wrapper to be easy to implement, some sort of path into the application (such as a telnet session, an ODBC path to the database, or GETting or POSTing a URL) must exist.

    This is a collection of html and cgi examples that demonstrate all the steps required to build an MUSC-class application wrapper - demonstrating each of the above features. The main idea is this: We have lots of applications that are not yet web-enabled. We have other applications that are web enabled, but have no authentication or access control. A wrapper is a chunk of code that we wrap around an application - like a candy wrapper: you have to get through the wrapper in order to get to the good stuff. The good stuff, in this case, is the stuff in the database.

    The primary components for data transport between client and server are:

  • Acquire data from a client
  • Manipulate these data via a cgi-script
  • CGI transport of data to and from a database
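The three steps above can be sketched as one minimal CGI script. This is a Python sketch only; the field names (name, comment) are hypothetical, and the "database" step is stubbed out as an append to a tab-delimited file where a real wrapper would do an INSERT:

```python
#!/usr/bin/env python3
# Minimal CGI sketch: acquire data from the client, manipulate it,
# and transport it toward a data file. Field names are hypothetical.
import os
from urllib.parse import parse_qs

def handle_request(query_string, datafile="guestbook.tab"):
    # Acquire: for a GET, the web server hands form data to the
    # cgi-script in the QUERY_STRING environment variable
    form = parse_qs(query_string)
    name = form.get("name", ["anonymous"])[0]
    comment = form.get("comment", [""])[0]
    # Manipulate: trim the fields and build one record
    record = "%s\t%s\n" % (name.strip(), comment.strip())
    # Transport: append a tab-delimited record (a database INSERT
    # in a real application)
    with open(datafile, "a") as f:
        f.write(record)
    # Respond: MIME header, blank line, then the document
    return "Content-type: text/html\r\n\r\n<p>Thanks, %s</p>" % name.strip()

if __name__ == "__main__":
    print(handle_request(os.environ.get("QUERY_STRING", "")))
```

The same script handles a POST by reading the record from standard input instead of QUERY_STRING; the acquire/manipulate/transport shape is unchanged.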

Outlining Session Control / Management

On top of this, some sort of authentication, access control, logging and session management is needed. Our tools to date consist of:

  • Access authentication and control

    download mod_auth_any.tgz for the apache module

    download authClient.tgz (install authClient in /usr/local/bin and set apache httpd.conf and access.conf to allow all for .htaccess, and probably something else also). Nothing ever works right the first time - oh sigh!

    download an example script

  • Session management (auto timeout of authentication privileges; session.py and expire.py - Matthew will enlighten us here)
    
    Usage: session.py [OPTIONS]... -f -t -i -n
        -f function to perform
            open: open a new session. -fopen -t(required)
            refresh: updates the "time to live" of a session. -frefresh -i(required) -t(required)
            write: write STDIN to a session variable. -fwrite -i(required) -n(required)
            read: read session variable and write to STDOUT. -fread -i(required) -n(required)
            close: close an active session. -fclose -i(required)
        -t time to live in minutes (no fractions). A time of 0 never expires
        -i session id
        -n name of variable to act upon
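
    The semantics behind those options can be sketched as a small session table. This is a hypothetical in-memory reimplementation for illustration only; the real session.py persists sessions on disk and is driven by the command-line options above:

```python
# Sketch of the session.py semantics: open, refresh, read/write named
# variables, close, with a time-to-live in minutes (0 = never expires).
# Hypothetical in-memory version for illustration only.
import time
import uuid

_sessions = {}

def open_session(ttl_minutes):
    """open: create a session and return its id (-fopen -t)."""
    sid = uuid.uuid4().hex
    _sessions[sid] = {"ttl": ttl_minutes, "touched": time.time(), "vars": {}}
    return sid

def _alive(sid):
    s = _sessions.get(sid)
    if s is None:
        return False
    if s["ttl"] == 0:                     # a ttl of 0 never expires
        return True
    return (time.time() - s["touched"]) < s["ttl"] * 60

def refresh(sid, ttl_minutes):
    """refresh: update the time to live of an active session."""
    if not _alive(sid):
        return False
    _sessions[sid]["ttl"] = ttl_minutes
    _sessions[sid]["touched"] = time.time()
    return True

def write_var(sid, name, value):
    """write: store a session variable (STDIN in the real tool)."""
    if not _alive(sid):
        return False
    _sessions[sid]["vars"][name] = value
    return True

def read_var(sid, name):
    """read: fetch a session variable (STDOUT in the real tool)."""
    if not _alive(sid):
        return None
    return _sessions[sid]["vars"].get(name)

def close_session(sid):
    """close: remove an active session."""
    _sessions.pop(sid, None)
```

    The key point for wrappers is the expiry check: every operation first verifies the session is still alive, so a browser left open past the time-to-live simply loses its "rights" to continue.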
    
    
  • Logging of users and data transfer

    Below are example scripts of each component, and then a full-blown example of a wrapper. These are presented with the idea that the easiest way to learn is to first copy existing scripts, install them and see if they work correctly (problems are usually with permissions, apache config files, database setup etc). Once the mechanics are behind you, then start to modify these programs so that they do what you want. The strategy is designed to be minimal, in the spirit of the Kernighan and Ritchie printf("hello, world\n") example.

    Wrappers involve different strategies for transporting data, but HTTP is working well for some applications. Experiment with GET (a perl program) or cURL as ways to transport information. The IT lab packages information in several different formats: initially as HTML or tab-delimited records, or by changing the MIME type to MS-excel and pushing the data into an excel spreadsheet. Recently we started wrapping data with XML tags. For a cute sample of packaging, take a look at Table Export, which passes a URL to a cgi-script (on herman.itlab.musc.edu), parses the HTML for table tags, packages the tables, and is prepared to export them as tab-delimited records or dump them into a spreadsheet.
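
    The core of the Table Export idea can be sketched in a few lines. This is a hypothetical illustration of the technique (scan HTML for table tags, repackage the cells as tab-delimited records), not the actual cgi-script running on herman.itlab.musc.edu:

```python
# Sketch of the Table Export technique: parse HTML for table tags
# and repackage each table row as a tab-delimited record.
from html.parser import HTMLParser

class TableGrabber(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.row = []
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):          # a new cell starts
            self.in_cell = True
            self.row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.in_cell = False
        elif tag == "tr" and self.row:   # a row ends: emit it
            self.rows.append(self.row)
            self.row = []

    def handle_data(self, data):
        if self.in_cell:                 # accumulate text inside a cell
            self.row[-1] += data.strip()

def tables_to_tab(html):
    """Return the table cells of an HTML page as tab-delimited records."""
    p = TableGrabber()
    p.feed(html)
    return "\n".join("\t".join(r) for r in p.rows)
```

    Exporting to a spreadsheet is then just a matter of changing the MIME type on the response (e.g. to an excel type) while keeping the tab-delimited body.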

  • Example of MUSC authentication
  • Example of MUSC access control
  • Example of MUSC logging management
  • Example of MUSC session management
  • Example of getting input from a browser
  • Example of putting data into a database

WRAPPER Tools: Authentication, Access Control, Session Management and Logging

Authentication and access control is managed with Apache via a control file named .htaccess:


AuthName "(MNA ID and Password)"
AuthType Basic
AuthAnyUserProg "/usr/local/bin/authClient atrium.musc.edu 10070"
require valid-user

To use access control, managed by apache and controlled by the contents of .htaccess (in the directory to be controlled, usually the directory that contains index.html), and to coordinate it with the MUSC authentication tools, use Nafee's tool, authClient (available above in webNIS). Modify httpd.conf so that:

#
# This controls which options the .htaccess files in directories can
# override. Can also be "All", or any combination of "Options", "FileInfo", 
# "AuthConfig", and "Limit"
#
    AllowOverride All

To install mod_auth_any, use apxs (from apache-devel***.rpm):

   Unpack the tarball: zcat mod_auth_any.tgz | tar -xf -
   Compile and install the module with apxs:
   apxs -c mod_auth_any.c
   apxs -i -a -n auth_any mod_auth_any.so 
   Ignore complaints about EAPI, and don't forget to restart httpd. 

Move authClient to a convenient place, i.e. /usr/local/bin, and then refer to it in .htaccess. authClient talks with an authentication daemon on atrium and compares the login id and password with /etc/passwd. This really neat tool facilitates distributed management of access control while exercising control from Apache - almost too good to be true.
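
The shape of such an external authenticator can be sketched as follows. This is a hypothetical stand-in for authClient, with a local password table replacing the daemon on atrium; the exact calling convention of mod_auth_any (whether credentials arrive as an argument or on stdin) should be checked against its documentation:

```python
# Hypothetical stand-in for an AuthAnyUserProg authenticator: an
# external program checks a login id / password pair and signals
# the result with its exit status (0 = authenticated, 1 = refused).
# The table below replaces the real daemon on atrium.musc.edu.
import sys

USERS = {"frank": "secret"}   # stand-in for the real password source

def authenticate(user, password):
    """Return True if the pair matches the (stand-in) password table."""
    return USERS.get(user) == password

def main(argv, stdin):
    # Assumed convention: username as argument, password on stdin.
    user = argv[1] if len(argv) > 1 else ""
    password = stdin.readline().rstrip("\n")
    return 0 if authenticate(user, password) else 1

# A real authenticator would end with: sys.exit(main(sys.argv, sys.stdin))
```

The point of the design is the same as authClient's: Apache never needs to know how the check is done, so the password source can live on another machine entirely.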

WRAPPER Tools: SOAP and HTTP Transport Tools

For some additional ideas, check out some of the SOAP (simple object access protocol) tools, particularly those linked to PERL module support, such as the Perl SOAP examples.

Something interesting to try is to move information from a web site manually. Try telnetting to port 80 of a web server and manually giving the GET command:

telnet www.itlab.musc.edu 80

after the connection is established, type the request line followed by a blank line (inside a raw telnet session, only the path portion of the URL is given):

GET /cgi-bin/mySiteMaker/general_search.cgi?conf_file_name=SiteMaker_new_test.conf HTTP/1.0

and you should see the html that is triggered by the cgi-script. Now to change the presentation, we have implemented a parameter: out_type = text, html, xml or excel. So append &out_type=xml to the query string and try (for example, with the Perl GET program, which takes a full URL):

GET "http://www.itlab.musc.edu/cgi-bin/mySiteMaker/general_search.cgi? \
	conf_file_name=SiteMaker_new_test.conf&out_type=xml"

One way of looking at the above GET command (on unix systems there are also /usr/bin/GET and /usr/bin/POST from the PERL library) is as a sentence: GET "what", where the "what" is the cgi-script name plus command line information that gives specific detail as to what to GET. The output should appear as click here. Also available: Click to link with CANNED_QUERIES.
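
The telnet exercise can also be sketched programmatically: the raw request that the GET tool builds for you is just a request line, a Host header, and a blank line. A Python sketch (the host and path are the ones from the example above; sending the bytes over a socket to port 80 is left out):

```python
# Sketch of the raw HTTP request that the telnet exercise types by hand:
# a request line, a Host header, and a blank line ending the headers.
def build_get_request(host, path_and_query):
    """Return the raw HTTP/1.0 request for GETting a URL by hand."""
    return ("GET %s HTTP/1.0\r\n"
            "Host: %s\r\n"
            "\r\n" % (path_and_query, host))

request = build_get_request(
    "www.itlab.musc.edu",
    "/cgi-bin/mySiteMaker/general_search.cgi"
    "?conf_file_name=SiteMaker_new_test.conf&out_type=xml")
```

Everything a wrapper transports, in both directions, is ultimately this kind of text, which is what makes HTTP such a convenient path into a legacy application.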