Application Web Service Toolkit
Geoffrey Fox, Marlon Pierce, Ozgur Balsoy
Indiana University
July 24 2002

Motivating Concerns
• Grid infrastructure software tends to be geared toward low level services:
  – Job submission and monitoring
  – Information services
  – Distributed security
• Typical end users are really interested in running specific applications, not using low level grid tools.
  – GCE computing portals build user-centric tools out of Grid infrastructure tools.
  – The grid infrastructure is a great foundation for this.
• Not all HPC resources use Grid infrastructure technologies in production but still see the value of portals
  – DoD MSRCs are good examples
  – We need service definitions that are independent of implementation
• There are no well defined general ways for
  – Describing scientific applications
  – Specifying the Grid services these applications require
  – Adding these applications to a Grid or to a computing portal

Some Stakes in the Ground (and a rough outline)
• We need interfaces to describe real scientific applications and their execution life cycles.
• Application ‘bundles’ are composed of core, general purpose Web services.
  – Need composition languages
• Application interfaces define some related Web services useful to portals
  – Description, archiving
• Application bundles can also be composed to form larger applications.
• Application interfaces can be used to automatically generate browser interfaces.
• Browser interfaces to services are portlets, can be aggregated into portals (such as with Jetspeed).

Application Web Services Toolkit
• We are building a portal infrastructure to implement the ideas on the previous slide.
• Our goal is to be application-centric
  – By “application” we mean some scientific or engineering code (finite element codes, quantum chemistry codes, etc).
  – Applications are added to the portal in well defined ways.
• The application interface is defined by the developer (or application expert) through well defined XML schema.
• The application is bound to various core service interfaces.
  – Core service interfaces are in WSDL, will be in OGSA
  – Applications can be statically bound to core services (and specific resources), as we describe here.
  – Binding can also be dynamic, through service discovery mechanisms.

Application Lifecycles
• Abstract State: describes potential uses of the application.
  – Analogous to a.out on a file system
• Ready State: a specific application instance has been set up but not run
• Submitted State: Application instance is running
  – Analogy: sh a.out
• Archived State: The job is completed and metadata about the instance is stored.
  – Metadata can be queried later as a service
  – Archived instances can be used to create new instances.
• We need ways of describing all of these states.

Application Web Service Bundles
• An application is composed of several core services.
  – Application description, batch script generation, job submission, job monitoring, file transfer.
• These should be deployable on a specific compute resource.
• These become services on a host resource.
• The application should have two parts:
  – Describe how to invoke
  – Describe ‘workflow’ of how the core services interact

[Figure labels: Application Web Service Interface; Job Submit; Batch Script Generation; File Transfer; Application Description]

Application Composition
• Application services on specific resources can be combined to create aggregate applications.
• That is, ‘Run code 1 on machine 1, use output for code 2 on machine 2, and use visualization service on machine 3’
• We are currently trying to decide the best way to do this workflow.

[Figure labels: Composite AWS; AWS 1; AWS 2; AWS 3]

Application Web Service Schemas
• From the preceding, we have two sets of schema:
  – First set defines the abstract state of the application
    • What are my options for invoking disloc?
    • Call these “abstract descriptors”.
  – Second set defines a specific instance of the application
    • I want to use disloc with input1.dat on solar.uits.indiana.edu.
    • Call these “instance descriptors”.
• Each descriptor group consists of
  – Application descriptor schema
  – Host (resource) descriptor schema
  – Execution environment (queue or shell) descriptor schema

Container Structure for Descriptors
• Applications contain hosts, which contain execution environments.
• Each of these is an independent schema that gets included in the parent with a “binding” tag that uses <xsd:anyType>
  – We were inspired by WSDL bindings here.
• This keeps schema scope manageable, makes schema pluggable
  – I can use someone else’s schema for my host descriptor if I like theirs better than mine.

Application Schema Elements
• The host and environment descriptors are the usual stuff, so we won’t go over that here.
• Abstract descriptors describe possible invocations of the application.
  – Edited by application deployers, used to generate user interfaces
• Instance descriptors describe particular invocations of the application.
  – Created from user interaction with portal interface, stored as application archive

Abstract Application Schema Elements: Application

Application Element
• Each application is defined by a number of fields
  – Name and version tags are strings
  – Flag tag lists code flags that can be set and how to use them.
  – Input, output, and error ports describe how the code handles I/O and errors.
  – ApplicationParameter tag provides a general purpose place to put arbitrary name/value pairs if you need more descriptions.
  – HostBinding tag contains the binding to the host description schema.

Abstract Application Schema Elements: InputPort

InputPort Explanation
• You need one input port for each piece of input that the code expects.
  – Typically just an input file.
• InputHandle is a useful short name for referring to this input port.
• InputDescription is a longer string that lets the application deployer describe in detail what this input port is used for.
• InputMechanism describes how the input for the code is acquired.
  – Right now, just a file from local disk.
  – Will support services to load input from specified resource (local file, URL, database, etc.).
• Output and error ports are similar.

Application Descriptor Services
• After defining the schema, the next step is to cast it into language bindings
  – We used Castor to unmarshal XML to Java
• Ultimately, we will make this a Web service
  – Get/set and query methods for the Application are obvious candidates for turning into a WSDL interface.
• So the application descriptions will all live in a repository independent of the user interface server.

Application Instances
• Host and execution environment (queue) descriptors are familiar.
• Application Instance descriptors are superficially similar to the previous abstract descriptors.
  – Contain name, version, input/output/error ports, host bindings, etc.
  – Recall the difference is that the instance is a specific subset of possibilities described by the abstract description.

Application Instance Schema Services
• Again, we just cast the schema into Java code.
• We then implement JavaBeans/JSPs for manipulating the schema.
  – Applications can be deployed in the portal
• Methods for working with schema instances/Java code will be used to define a Web service
  – The instances are stored in a repository logically separate from the user interface server.
  – The repository might be a file system or an XML database or whatever, but this implementation detail is hidden behind the interface.

Automatic Interface Generation with Schema Wizards
• Gateway schema pages are currently one-shot development efforts.
  – We map HTML forms to a specific schema.
  – Form actions update schema instances through Castor generated classes.
• More generally, we want to be able to map general purpose schema elements to GUI widgets (HTML or otherwise).
  – We call this sort of thing a schema wizard.

Schema Wizard Architecture
[Figure labels: XML Schema; Castor SOM Schema Processor; Castor SourceGenerator; Velocity Templates; JavaBeans; JSP Templates and HTML forms]

Schema Wizard Explanation
• A schema is read in to create an in-memory representation (SOM) and also to create Java files.
  – SOM = Castor’s Schema Object Model
• Each schema element is mapped to a self-contained JSP nugget.
• JSP nuggets are generated from templates.
  – One template for each element type (simple, complex, enumerated, unbounded, ...).
  – Velocity is used for convenient scripting of JSP.
• The final JSP page is an aggregate of the JSP nugget files (using <%@ include %>).
• Complex schema elements are mapped to JavaBeans generated from the schema with Castor.
  – Scripting templates set up the imports

Where Are We Really?
• Many core Web service implementations developed.
• Application schema are available and have been implemented in a demo portal.
• Schema wizard is still in development.
• Lots of work on remote portlets for Jetspeed
  – Navigable, session maintaining, form parameter passing portlets developed.
  – Still need to work out security.
  – Still need to incorporate schema wizard as a portlet.

References
• See http://grids.ucs.indiana.edu:9000/slide/ptliu/research/gateway/AWS/AWS.doc for a short report and lots of XML Spy generated schema documentation.
• See http://grids.ucs.indiana.edu:8045/GCWS/Schema/index.html for the schemas.
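
Sketch: Application Lifecycle in Java
As a rough illustration of the Application Lifecycles slide, the Java sketch below models the four states (abstract, ready, submitted, archived) and the idea that archived instances can be used to create new instances. It is only a sketch under stated assumptions: every class, method, and field name here is hypothetical and is not taken from the AWS schemas or from Castor generated classes, and the rule that only archived instances may seed new ones is a design choice made for the example.

```java
// Hypothetical sketch: none of these names come from the AWS schemas.

/** The four lifecycle states named in the slides. */
enum LifecycleState {
    ABSTRACT, READY, SUBMITTED, ARCHIVED;

    /** True if a descriptor may move directly from this state to target. */
    boolean canMoveTo(LifecycleState target) {
        switch (this) {
            case ABSTRACT:  return target == READY;      // a specific instance is set up
            case READY:     return target == SUBMITTED;  // the instance is run
            case SUBMITTED: return target == ARCHIVED;   // job completes, metadata is stored
            default:        return false;                // ARCHIVED is final for this instance
        }
    }
}

/** A minimal instance descriptor: one application, one input file, one host. */
final class ApplicationInstance {
    private final String application;
    private final String inputFile;
    private final String host;
    private LifecycleState state = LifecycleState.READY;

    ApplicationInstance(String application, String inputFile, String host) {
        this.application = application;
        this.inputFile = inputFile;
        this.host = host;
    }

    LifecycleState state() { return state; }
    String inputFile() { return inputFile; }
    String host() { return host; }

    /** Advances the lifecycle, rejecting transitions the slides do not describe. */
    void moveTo(LifecycleState target) {
        if (!state.canMoveTo(target)) {
            throw new IllegalStateException(state + " -> " + target + " is not allowed");
        }
        state = target;
    }

    /** Assumption: only an archived instance may be the starting point for a new READY instance. */
    ApplicationInstance newInstanceWithInput(String newInputFile) {
        if (state != LifecycleState.ARCHIVED) {
            throw new IllegalStateException("only archived instances can seed new ones");
        }
        return new ApplicationInstance(application, newInputFile, host);
    }
}
```

For example, an instance created for disloc with input1.dat on solar.uits.indiana.edu starts in the ready state, moves to submitted and then archived, and can then be used to create a fresh ready instance with a different (here invented) input file such as input2.dat.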