A Self-Manageable Infrastructure for Supporting Web-based Simulations
Yingping Huang
Xiaorong Xiang
Gregory Madey
Computer Science & Engineering
University of Notre Dame
Sponsored by NSF/ITR-DEB
Outline
Introduction
  Autonomic Computing
  Web-based Simulations
Self-manageable infrastructure
  Self-Configuring
  Self-Healing
  Self-Optimizing
  Self-Protecting
Conclusion and future work
Autonomic Computing

Motivation
  What’s next? A dozen information technology research goals (J. Gray, Microsoft Research)
Goal
  The SysAdmin sets system goals and high-level policies
  The system takes care of itself
http://www.ibm.com/autonomic

Autonomic Computing (cont)
Self-Configuring
  New simulations, new simulation servers
Self-Healing
  Completed simulations
Self-Optimizing
  Efficient usage of system resources
Self-Protecting
  No unauthorized access
Web-based Simulations
Features of Web-based simulations
  Simulations run on the simulation servers
  Simulation data is downloadable for users
  Simulation reports are generated dynamically
  Simulation status is sent to users by email
  Collaboration among users
Challenges
  Reliability
  Availability
  Efficiency
  Security
Motivation: NOMSIM
  Simulate natural organic matter (NOM) evolution behavior
  Agent-based stochastic simulation method
  Multi-disciplinary project that involves chemists, biologists, environmental scientists, geologists and computer scientists
  Collaboration is essential
  Funded by NSF-ITR
The Infrastructure
Features of the Infrastructure
Scalability
  Web server tier: new application servers can be added to the load-balanced application server cluster
  Simulation server tier: scales almost linearly by installing new simulation servers running identical simulations
  Database server tier: Real Application Clusters (RAC) lets all active instances execute transactions against a shared database
Features of the Infrastructure (cont)
Availability
  Web server tier: redundancy and failover eliminate the single point of failure; session state is maintained in the database server tier
  Simulation server tier: simulation checkpointing and resuming
  Database server tier: redundancy and failover eliminate the single point of failure
Simulation Metadata
<simulation name="nomsim">
  <db_url>
    <url>jdbc:oracle:thin:[email protected]:port:sid</url>
    <username>dbusername</username>
    <password>dbpassword</password>
  </db_url>
  <input_part>
    <input name="time" type="number" />
    <input name="temperature" type="number" />
    <input name="granted" type="char(1)" />
    <input name="molecule_name" type="varchar2(50)" />
  </input_part>
</simulation>
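As a rough illustration of how such metadata could be consumed, the sketch below reads the `<input>` declarations with the standard Java DOM parser. The class name `MetadataReader` and the method `readInputs` are hypothetical; the slides do not show how the actual system parses the file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not taken from the original system.
class MetadataReader {

    // Returns input name -> declared type, in document order.
    static Map<String, String> readInputs(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> inputs = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("input");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element input = (Element) nodes.item(i);
                inputs.put(input.getAttribute("name"), input.getAttribute("type"));
            }
            return inputs;
        } catch (Exception e) {
            // Malformed metadata is reported as a bad argument in this sketch.
            throw new IllegalArgumentException("cannot parse simulation metadata", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<simulation name=\"nomsim\"><input_part>"
                + "<input name=\"time\" type=\"number\" />"
                + "<input name=\"temperature\" type=\"number\" />"
                + "</input_part></simulation>";
        System.out.println(readInputs(xml));
    }
}
```

A `LinkedHashMap` is used so the inputs keep the order in which they are declared, which matters if forms or tables are later built from them.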
Simulation Manager and Intelligent Agents
One intelligent agent runs on each simulation server
Functionalities of intelligent agents
  Register new simulation servers with the simulation manager
  Report metrics of simulation servers to the simulation manager
  Deploy new simulation models
  Check for simulation jobs
  Transport data from simulation servers to database servers
  Cancel simulation jobs as directed by the simulation manager
Functionalities of simulation manager
  Dispatch and manage simulation jobs
  Notify users of simulation job status
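The division of labor between agents and manager can be sketched as follows. This is a minimal illustration under assumptions: the class `SimulationManagerSketch`, its methods, the single numeric load metric, and the least-loaded dispatch rule are all hypothetical, since the slides do not state which metrics are reported or how jobs are assigned.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical manager-side bookkeeping; names and policy are assumptions.
class SimulationManagerSketch {

    // Latest load reported by each registered server's agent (lower means less busy).
    private final Map<String, Double> loadByServer = new HashMap<>();

    // Called by an agent when its simulation server joins the pool.
    void register(String server) {
        loadByServer.putIfAbsent(server, 0.0);
    }

    // Called periodically by an agent to report its server's current load.
    void reportLoad(String server, double load) {
        if (!loadByServer.containsKey(server)) {
            throw new IllegalArgumentException("unregistered server: " + server);
        }
        loadByServer.put(server, load);
    }

    // Picks the registered server with the lowest reported load, if any.
    Optional<String> dispatch() {
        return loadByServer.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        SimulationManagerSketch manager = new SimulationManagerSketch();
        manager.register("sim1");
        manager.register("sim2");
        manager.reportLoad("sim1", 0.9);
        manager.reportLoad("sim2", 0.2);
        System.out.println(manager.dispatch().orElse("no server available"));
    }
}
```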
Self-Configuring
<simulation name="nomsim">
  <db_url>
    <url>jdbc:oracle:thin:[email protected]:port:sid</url>
    <username>dbusername</username>
    <password>dbpassword</password>
  </db_url>
  <input_part>
    <input name="time" type="number" />
    <input name="temperature" type="number" />
    <input name="granted" type="char(1)" />
    <input name="molecule_name" type="varchar2(50)" />
  </input_part>
</simulation>
HTML form
JSP Code
JavaScript Form Validation
Database Table
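One plausible reading of this slide is that the form and the table are derived from the metadata. The sketch below shows how a `CREATE TABLE` statement and an HTML form might be generated from the declared inputs. Everything here is an assumption for illustration: the class `FormAndTableGenerator`, the `_input` table suffix, and the `.jsp` form action are hypothetical, and identifiers are not validated or escaped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical generator; naming conventions are assumptions, not the authors' design.
class FormAndTableGenerator {

    // Builds a CREATE TABLE statement with one column per metadata input.
    static String createTableSql(String simulation, Map<String, String> inputs) {
        StringBuilder sql = new StringBuilder("CREATE TABLE " + simulation + "_input (");
        String separator = "";
        for (Map.Entry<String, String> input : inputs.entrySet()) {
            sql.append(separator).append(input.getKey()).append(' ').append(input.getValue());
            separator = ", ";
        }
        return sql.append(")").toString();
    }

    // Builds an HTML form with one text field per metadata input.
    static String htmlForm(String simulation, Map<String, String> inputs) {
        StringBuilder html = new StringBuilder(
                "<form method=\"post\" action=\"" + simulation + ".jsp\">\n");
        for (String name : inputs.keySet()) {
            html.append("  <label>").append(name)
                .append(" <input type=\"text\" name=\"").append(name).append("\" /></label>\n");
        }
        return html.append("  <input type=\"submit\" />\n</form>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> inputs = new LinkedHashMap<>();
        inputs.put("time", "number");
        inputs.put("molecule_name", "varchar2(50)");
        System.out.println(createTableSql("nomsim", inputs));
        System.out.println(htmlForm("nomsim", inputs));
    }
}
```

Because the column types in the metadata are already Oracle types, they can be copied into the statement unchanged. A real generator would also need to validate names before building SQL from them.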
Self-Configuring (cont)
On simulation servers
  Intelligent agents must run
  Install simulation software
    (To simplify, the simulation software is installed on an NFS server that is mounted on the simulation servers)
On simulation manager
  Email masquerading
Self-Healing
Self-Healing Web servers
  Clustered application server instances
  Automatic recovery of failed instance
Self-Healing simulation servers
  Simulation checkpointing
  Simulation resuming
Self-Healing database servers
  Clustered database instances
  Automatic recovery of failed instance
  RAID 0+1
Self-Healing (cont)
[Figure: Simulation (Simulation Server Tier) and RDBMS (Database Server Tier), linked by Checkpointing and Resuming]
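Checkpointing and resuming can be illustrated with a minimal sketch. It assumes a simulation that advances in discrete steps and saves its progress at a fixed interval. The class `CheckpointSketch` is hypothetical, and an in-memory map stands in for the RDBMS, which the real system would reach over JDBC.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the map below is a stand-in for the database tier.
class CheckpointSketch {

    // Job id -> last step saved by a checkpoint.
    private final Map<String, Integer> store = new HashMap<>();

    // Last checkpointed step for a job, or 0 if none was saved.
    int lastCheckpoint(String jobId) {
        return store.getOrDefault(jobId, 0);
    }

    // Runs the job from its last checkpoint up to totalSteps and returns how many
    // steps were executed in this call. A crash is simulated when failAtStep is
    // reached; pass a negative value for a run without failure.
    int run(String jobId, int totalSteps, int interval, int failAtStep) {
        int step = lastCheckpoint(jobId);   // resume point
        int executed = 0;
        while (step < totalSteps) {
            if (step == failAtStep) {
                throw new IllegalStateException("simulated crash at step " + step);
            }
            step++;                         // one unit of simulation work
            executed++;
            if (step % interval == 0) {
                store.put(jobId, step);     // checkpoint
            }
        }
        store.remove(jobId);                // job finished, checkpoint no longer needed
        return executed;
    }

    public static void main(String[] args) {
        CheckpointSketch sim = new CheckpointSketch();
        try {
            sim.run("job-1", 10, 3, 7);
        } catch (IllegalStateException crash) {
            System.out.println(crash.getMessage() + ", resuming from step " + sim.lastCheckpoint("job-1"));
        }
        System.out.println("steps executed after resume: " + sim.run("job-1", 10, 3, -1));
    }
}
```

Only the work done since the last checkpoint is repeated after a failure, so the checkpoint interval trades database writes against lost computation.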
Self-Protecting
Role-based access control
  Public
  Owner
  Grant
Firewall
  Port scan
  iptables
  Log message scanning
  Network traffic monitoring, Intrusion Detection (Future work)
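The three access levels named above could be checked along these lines. This is a sketch under assumptions: the slides list only Public, Owner, and Grant, so the rule that a simulation is viewable when it is public, when the user owns it, or when the owner has granted access is an interpretation, and `AccessControlSketch` is a hypothetical class.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical access check for one simulation; the rules are an interpretation.
class AccessControlSketch {

    private final String owner;
    private final boolean isPublic;
    private final Set<String> granted = new HashSet<>();

    AccessControlSketch(String owner, boolean isPublic) {
        this.owner = owner;
        this.isPublic = isPublic;
    }

    // Only the owner may grant another user access.
    void grant(String requester, String user) {
        if (!owner.equals(requester)) {
            throw new SecurityException("only the owner may grant access");
        }
        granted.add(user);
    }

    // Public simulations are open to all; otherwise owner or grantee only.
    boolean canView(String user) {
        return isPublic || owner.equals(user) || granted.contains(user);
    }

    public static void main(String[] args) {
        AccessControlSketch simulation = new AccessControlSketch("alice", false);
        simulation.grant("alice", "bob");
        System.out.println("bob: " + simulation.canView("bob"));
        System.out.println("carol: " + simulation.canView("carol"));
    }
}
```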
Self-Optimizing
Self-Optimizing web server tier
  Load balanced application server cluster
Self-Optimizing database servers
  Database parameter self-tuning
  Online index rebuilding
  Summary and aggregation
Self-Optimizing (cont)
[Figure: Simulation Server 1 and Simulation Server 2 over Time, with Checkpoint and Migrate steps through the DBMS]
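A migration decision of the kind pictured here might look like the sketch below. The policy is an assumption made for illustration: move a checkpointed job only when another server is less loaded by more than a threshold. The class `MigrationSketch`, the method `migrationTarget`, and the `minGain` parameter are hypothetical, and the current server is assumed to be present in the load map.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical policy; the slides do not specify when a job is migrated.
class MigrationSketch {

    // Returns the least-loaded other server if moving there reduces load by more
    // than minGain, otherwise an empty Optional (the job stays where it is).
    static Optional<String> migrationTarget(String current, Map<String, Double> load, double minGain) {
        double currentLoad = load.get(current);
        return load.entrySet().stream()
                .filter(e -> !e.getKey().equals(current))
                .min(Map.Entry.comparingByValue())
                .filter(e -> currentLoad - e.getValue() > minGain)
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        Map<String, Double> load = Map.of("sim1", 0.9, "sim2", 0.3);
        System.out.println(migrationTarget("sim1", load, 0.2).orElse("stay on sim1"));
    }
}
```

The threshold keeps jobs from bouncing between servers whose loads differ only slightly, since each move costs a checkpoint write and a restore.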
Implementation of self-*
Tools
  IBM’s ABLE (Agent Building and Learning Environment)
  Oracle Data Mining
  Unix crontab
Languages
  Java
  SQL and PL/SQL
  Bourne shell scripts
Conclusions and Future Work
Conclusions
  Self-Manageable infrastructure
    Intelligent agents
    Simulation manager
Future work
  Applying data mining
    “Intelligent agents”
    Proactive critical event prediction
    Job completion time prediction

Questions?
Thank You