The 2,161st Meeting of the Society

April 4, 2003

Dynamic Data Driven Application Systems

Frederica Darema

National Science Foundation

About the Lecture

Dynamic Data Driven Application Systems (DDDAS) is a new paradigm for application simulations that can accept and respond dynamically to new data injected at execution time; conversely, such application systems will have the ability to dynamically control the measurement processes. The synergistic feedback control loop between simulations and measurements can open new domains in the capabilities of simulations with high potential pay-off, create applications with new and enhanced analysis and prediction capabilities, and enable a new methodology for more efficient and effective measurement processes. This new paradigm has the potential to transform the way science and engineering are done and to have a major impact on the way many functions in our society are conducted, such as manufacturing, commerce, transportation, hazard prediction/management, and medicine. The presentation will discuss the opportunities and challenges, and will provide examples of applications and ongoing research to enable such capabilities.

About the Speaker

Frederica Darema is the Senior Science and Technology Advisor at EIA and the National Science Foundation's Computer & Information Science & Engineering Directorate, and Director of the Next Generation Software (NGS) and Biological Information Technology & Systems (BITS) Programs. She received her BS degree from the School of Physics and Mathematics of the University of Athens, Greece, and MS and Ph.D. degrees in Theoretical Nuclear Physics from the Illinois Institute of Technology and the University of California at Davis, respectively, where she attended as a Fulbright Scholar and a Distinguished Scholar. After Physics Research Associate positions at the University of Pittsburgh and Brookhaven National Lab, she received an APS Industrial Fellowship and became a Technical Staff Member in the Nuclear Sciences Department at Schlumberger-Doll Research. In 1982 she joined the IBM T. J. Watson Research Center as a Research Staff Member in the Computer Sciences Department, and later on she established and managed a research group at IBM Research on parallel applications. In 1984 she proposed the SPMD (Single-Program-Multiple-Data) computational model, which has become the popular model for programming today's parallel and distributed computers. She has been at NSF since 1994, where she has developed the DDDAS paradigm, and she is pushing for research at the interface of neurobiology and computing. She is also involved in cross-directorate programs for Nanotechnology Science and Engineering and for Scalable Enterprise Systems. During 1996-1998 she completed a two-year assignment at DARPA, where she initiated a new thrust for research on methods and technology for performance-engineered systems.

Minutes

The 2161st meeting of the Philosophical Society of Washington was called to order in the Powell Auditorium of the Cosmos Club at 8:20 PM, President Haapala in the chair. The president introduced the speaker for the evening, Dr. Frederica Darema, Senior Science and Technology Advisor to the Computer and Information Science and Engineering Directorate, National Science Foundation. Dr. Darema is also Director of the Next Generation Software Program. Dr. Darema received her B.S. degree from the School of Physics and Mathematics at the University of Athens, Greece, and M.S. and Ph.D. degrees in Theoretical Nuclear Physics from the Illinois Institute of Technology and U.C. Davis, respectively. She was a Fulbright Scholar and a Distinguished Scholar. In 1984 she proposed the computational model used for programming today's massively parallel and distributed computing systems.

Dynamic Data Driven Application Systems (DDDAS) are application simulations that can accept and respond dynamically to new data injected at execution time, and/or have the ability to control measurement processes. DDDAS represents a new paradigm for applications, simulations, and measurement methods. Another title that conveys the notion of the DDDAS paradigm is Symbiotic Measurement and Simulation Systems. Computer simulations start with a theoretical model of the system, an abstract conceptual description of the physical or other system, such as a process management system. The theoretical models are expressed in a mathematical representation, and these mathematical expressions are, in turn, coded into computer programs, which constitute the application or simulation software. Measurements are also made on properties or characteristics of the system. These measurement data are used as inputs to the computer programs, and the application executes and produces results that characterize the system under study. A new instantiation of the application's execution can be generated with a new input data set, and the process executes to simulate the behavior of the system under the new conditions. A simulation is validated if the output of the computer model compares closely with observations of the real system. The mathematical functions can be improved or fine-tuned to bring the results of the simulation closer to those of the external system. This is the iterative approach to modeling a physical system.

The ability to inject new data as the simulation is running (executing) can improve the application's simulation capabilities. The data dynamically injected at execution time could be data collected before the simulation model was run, or data collected in real time while the application is executing. This ability to add data to a running simulation can create more accurate simulations of complex events. Real-time data collection is in use today but is still often subject to post-processing analysis. A DDDAS application running an analysis could find that an area of interest in the simulation can be more accurately described or analyzed with these additional data. Conversely, the simulation can be used to control which measurements are collected, resulting in a more efficient measurement process. For example, for an application using sensor data, the DDDAS approach can be used to reduce the sampling rate of sensors not of interest and increase the sampling rate of sensors of interest, leading to greater data collection efficiency.
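To make the feedback loop concrete, the following is a minimal sketch, in Python, of the kind of control loop just described. It is not from the lecture: the names (step_model, read_sensors, regions_of_interest) and the sampling-rate rule are hypothetical illustrations, and a real measurement interface would stand behind read_sensors.

    import random

    BASE_RATE, HIGH_RATE = 1.0, 10.0  # samples per second (assumed units)

    def read_sensors(rates):
        # Stand-in for a real measurement interface: one noisy reading per
        # sensor (a real interface would honor the requested rates).
        return {s: 20.0 + random.gauss(0, 1) for s in rates}

    def step_model(state, measurements):
        # Advance the simulation one step, folding in newly injected data.
        for sensor, value in measurements.items():
            state[sensor] = 0.9 * state.get(sensor, value) + 0.1 * value
        return state

    def regions_of_interest(state, threshold=20.5):
        # Flag sensors whose simulated state warrants finer-grained data.
        return {s for s, v in state.items() if v > threshold}

    state = {}
    rates = {"sensor_a": BASE_RATE, "sensor_b": BASE_RATE}
    for step in range(100):
        data = read_sensors(rates)       # new data injected at execution time
        state = step_model(state, data)  # the simulation responds dynamically
        hot = regions_of_interest(state)
        for s in rates:                  # feedback: the simulation controls
            rates[s] = HIGH_RATE if s in hot else BASE_RATE  # the measurements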
There are many challenges in enabling DDDAS capabilities. New capabilities need to be developed at the application level, for example the ability of the application to interface with measurement systems. The streamed data might also introduce new modalities for describing the system, such as a different level of the physics involved, in cases where the analysis is of a physical system; that in turn requires the ability to dynamically select application components depending on the dynamically streamed data. One also needs application algorithms that are amenable and stable with respect to dynamically injected data. Furthermore, the systems support requirements for these kinds of environments, where the application requirements change during execution depending on the streamed data, go beyond the needs of present application execution. For example, it becomes more difficult to schedule an application, since we may not know in advance the memory requirements of the application or its execution duration, as these are adjusted dynamically.

Some examples of applications that would potentially benefit from DDDAS include the following. Accurate simulation of how and where a forest fire will propagate enables maximally effective allocation of resources for containing the fire and reduces the risk to fire-fighting personnel. Current fire models use radiative heat flux models, with more sophisticated approaches coupling these to atmospheric turbulence models to simulate the local wind and the vortices generated by the air heated by the fire, as these affect the course of the fire. Improved models would couple the atmospheric and wildfire models with aerial survey data and ground temperature monitoring systems (sensors), with the ability to inject such data and more accurately predict the course of the fire. E-business represents a change to the classical product supply chain: new orders driven by customers feed back simultaneously to self-adjusting distribution networks and to product designers. Seismic-shock-resistant and blast-resistant buildings could be created using DDDAS adaptive software, in designs where support columns have flexible joints or roller bearings; as sensors detect vibrations or pressure waves, the data are fed into a simulation in real time, and the simulation in turn can be used to adjust the pressure on the vibration dampers. Consider the World Trade Center collapse: an adequate sensor system coupled with DDDAS capabilities, while it could not have prevented the collapse, could have predicted it and alerted firefighters and others not to enter the buildings.

One of the technical challenges in designing DDDAS software is the need to dynamically switch software components and linked libraries while a program is executing, without stopping the running application. The computer algorithms must be tolerant of perturbations from dynamically injected data and able to handle data uncertainties. DDDAS will employ an extended spectrum of assemblies of hardware platforms, from the computers where the application executes to the measurement systems, and the networks that connect them and transfer the data. We are now moving beyond parallel computing into GRID computing, with processing systems consisting of diverse components such as CPUs distributed across various locations, together with other entities devoted to data collection, also most likely in various geographical locations.
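As an illustration of the component-selection challenge, here is a minimal Python sketch assuming components can be kept in an in-process registry and swapped as the character of the streamed data changes. The coarse/fine solvers and the switching rule are invented for illustration and are not an established DDDAS interface.

    def coarse_step(state, data):
        # Cheap model: relax the state toward the mean of the new data.
        return 0.5 * (state + sum(data) / len(data))

    def fine_step(state, data):
        # Costlier model, standing in for a finer level of the physics.
        for d in data:               # fold in every sample individually
            state = 0.9 * state + 0.1 * d
        return state

    COMPONENTS = {"coarse": coarse_step, "fine": fine_step}

    def select_component(data, threshold=100.0):
        # Assumed rule: switch to finer physics when the streamed data
        # introduce a new modality (crudely detected here by magnitude).
        return "fine" if max(data) > threshold else "coarse"

    state, active = 0.0, "coarse"
    for batch in ([1.0, 2.0], [150.0, 160.0], [3.0, 2.0]):  # streamed batches
        wanted = select_component(batch)
        if wanted != active:      # swap components during execution,
            active = wanted       # without stopping the running application
        state = COMPONENTS[active](state, batch)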
Modern cars have over 1,000 sensors: specialized microprocessors continuously measure various states such as air flow, tire pressure, and wheel rotation. Some of these make real-time comparisons (e.g., anti-lock brakes) while others simply read and report a single measure. In a GRID computing system, different CPUs would engage in data acquisition, advanced data visualization, data analysis and synthesis, or large-scale database storage and retrieval. All these diverse resources are brought together to solve one single application. National research laboratories are using GRID computing now, and companies such as IBM are investigating its commercial potential. We now have the requisite computing capability, RAM capacity, and computing speed, and we have developed some of the software expertise and experience. The time to embark on developing Dynamic Data Driven Application Systems is now.

Dr. Darema then proceeded to list numerous examples of the applications and advantages of DDDAS, a few of which follow. Modeling the propagation of biological viral agents through a population would be useful for homeland security; such models would analyze various scenarios such as rapid quarantine, the effect of infected individuals using public transportation, or panic evacuation. Petroleum fields are instrumented to enhance oil recovery: as oil is removed, the boundary conditions in the field change, and DDDAS approaches can be used to develop enhanced oil recovery capabilities; a research project pursuing this has attracted the interest of several oil companies as well as computer industry support. The Poseidon project couples an oceanic current model with plankton distribution dynamics: plankton observational data collected via satellites and oceanic current data from acoustic sensors are used to drive dynamically coupled ocean-biomass simulations. Such analysis capabilities can result in improved fisheries management as well as efficient methods for oil slick containment. In conclusion, Dr. Darema stressed the value of strengthened academic and industry collaboration, improved technology transfer, and the need for incentives from federal agencies.

The speaker then kindly agreed to answer questions, some of which follow.

Q- Could DDDAS be used to enhance our ability to discriminate objects in aerial observations such as radar or satellite data?
A- Yes. It can be used to enhance existing imaging methods by eliminating noise and refining the analysis of the data in regions of interest, resulting in more accurate object recognition, for example when one wants to discriminate a tank from a school bus.

Q- The new data might require new modes of describing the system. The heart of the issue is: how is this done dynamically?
A- This is the challenge. Let us consider an example, the case of the crack propagation I mentioned. The hot fluid in the tube and the temperature gradients can cause stresses that cause the crack. Sensors detect the temperature gradients and the stresses, and these measurements can be injected into the ongoing finite element simulation to refine the prediction of the potential onset or extent of a crack. To enhance the prediction accuracy, these data may need to be further combined, for example with molecular dynamics calculations, to derive some of the parameters needed in the FEM. See http://www.dddas.org for more examples and details.

Q- Is this an expert system?
A- No. Expert systems have the ability to learn. DDDAS can request and use additional input data that it did not have when the application execution was started, but this is not learning in the expert systems sense.

Q- How is the data represented?
A- That depends on the application and on the measurements. Data representation is a challenge in enabling DDDAS, and it goes beyond current interoperability issues, because the data models of the applications are not necessarily compatible with the data collected by the measurements, especially where the measurements may be collected by many modes and systems and be geographically dispersed. GRID computing, for example, also shows that one does not always have all the needed bandwidth, so there is the additional complexity of compressing the data or re-mapping the application.

Q- Would this work like the SETI program?
A- It is considerably different. SETI analyzes a batch of events at a time by using the idle time on multiple platforms to distribute chunks of computation. No new data are injected into an event while it is being processed.

President Haapala thanked the speaker on behalf of the Society and presented her with a one-year complimentary membership. The president then made the usual beverage control and parking announcements, and reminded the audience that, although the meetings are open to all, membership dues are the Society's sole source of income. He requested non-members to join and members to consider contributions.

Attendance: 31
Temperature: 28.3 °C
Weather: Cloudy and cool

Respectfully submitted,
David F. Bleil
Recording Secretary