WRF-ARW

From EGEE-see Wiki

Application description

One of the most interesting problems in weather forecasting is reproducing and forecasting airflow over complex terrain. For that reason, the Meteorology VO will also study how airflow interacts with the complex terrain found in many South Eastern European countries (including Georgia). Such countries have large areas covered with terrain obstacles, so high-resolution information is essential. Terrain complexity influences the weather in a variety of ways. Under stable atmospheric conditions, the terrain generates internal gravity waves that distribute momentum over wider areas. These processes may be related to strong winds (e.g., the bora wind) and to turbulence that affects air traffic. At the same time, valleys may be subject to stagnant cold-air pools, whose low wind speeds may pose health risks in populated areas. Under unstable conditions, on the other hand, convective clouds and precipitation form over complex terrain and may grow into severe thunderstorms. All these processes occur on spatial scales smaller than 100 km, and usually smaller than 10 km. Reproducing them requires numerical simulations at a resolution high enough to resolve the main orographic features of the area.

Meteorological models and related applications require a large number of numerical calculations, and many of them are already parallelised, which makes porting them to the grid a natural choice. The SEE-GRID-SCI infrastructure is a regional grid infrastructure in South Eastern Europe that provides an environment for developing and deploying this application.

The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. It features multiple dynamical cores, a 3-dimensional variational (3DVAR) data assimilation system, and a software architecture allowing for computational parallelism and system extensibility. WRF is suitable for a broad spectrum of applications across scales ranging from meters to thousands of kilometres. The WRF modeling system is in the public domain and is freely available for community use.

WRF-ARW model description

The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. It features multiple dynamical cores, a 3-dimensional variational (3DVAR) data assimilation system, and a software architecture allowing for computational parallelism and system extensibility. WRF is suitable for a broad spectrum of applications across scales ranging from meters to thousands of kilometres, including:

  • Idealized simulations (e.g. LES, convection, baroclinic waves),
  • Regional and global applications,
  • Parameterization research,
  • Data assimilation research,
  • Forecast research,
  • Real-time NWP,
  • Coupled-model applications,
  • Teaching.

The equation set for ARW is fully compressible, Eulerian and non-hydrostatic, with a run-time hydrostatic option. It is conservative for scalar variables. The model uses a terrain-following, hydrostatic-pressure vertical coordinate, with the top of the model being a constant pressure surface. The horizontal grid is the Arakawa C-grid. Time integration uses a third-order Runge-Kutta scheme, and the spatial discretization employs 2nd- to 6th-order schemes. The model supports both idealized and real-data applications with various lateral boundary condition options. It also supports one-way, two-way and moving nest options, and it runs on single-processor, shared-memory and distributed-memory computers.
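The third-order Runge-Kutta time integration mentioned above can be illustrated with a minimal sketch. It assumes the three-stage form with sub-steps of dt/3, dt/2 and dt, which is the form usually associated with ARW but is not spelled out on this page. The toy right-hand side (dy/dt = -y) and the function names are illustrative only and are not taken from the WRF code.

```python
from math import exp


def rk3_step(rhs, state, dt):
    """Advance `state` by one step with a three-stage Runge-Kutta scheme."""
    first = state + (dt / 3.0) * rhs(state)    # stage 1: dt/3 predictor
    second = state + (dt / 2.0) * rhs(first)   # stage 2: dt/2 predictor
    return state + dt * rhs(second)            # stage 3: full step


def integrate(rhs, state, dt, steps):
    """Apply rk3_step repeatedly and return the final state."""
    for _ in range(steps):
        state = rk3_step(rhs, state, dt)
    return state


# Toy problem (not a WRF equation): dy/dt = -y with y(0) = 1, integrated to t = 1.
approx = integrate(lambda y: -y, 1.0, 0.1, 10)
error = abs(approx - exp(-1.0))
print(approx, error)
```

For a linear right-hand side, one step of this scheme reproduces the Taylor expansion of the exact solution up to the dt^3 term, which is where the third-order accuracy comes from.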


WRF-ARW model workflow


Image:Wrf_workflow.jpg


Target

The main target is a powerful application for use by the scientific community (e.g. climate change and airflow studies) and in weather forecasting. The target audience comprises the scientific community, which will use the application as a research tool; national weather forecasting agencies, which will use it for weather forecasting; and students, who will use it in their studies and as a learning tool.

Porting the application to the grid infrastructure will make it possible to compute on very large input data sets on a fine grid (resolution of a few kilometres or less), which requires huge storage space and computational power. The final result will be more accurate and faster weather forecasting, which will significantly improve our knowledge of weather changes and benefit everyday life.

Usage scenario

Job management

The WRF-ARW model grid workflow is shown in the picture below. The main entry point to the grid is the UI (User Interface) node, where all the scripts that help users operate the WRF model on the grid are installed.

Image:Wrf_grid_flowchart.jpg


In a nutshell, the user, working on the UI node, submits the model to the grid WMS (Workload Management System). The WMS allocates a grid CE (Computing Element), i.e. a grid cluster. Before the model starts to execute on the allocated CE, the input terrestrial and initial/boundary conditions data, as well as the WRF binaries, are downloaded to the CE from the grid SE (Storage Element). Once all input data and binaries are on the CE, model execution starts using the MPI runtime environment.

Model execution starts with the pre-processing system, WPS. WPS consists of the geogrid application, which prepares the terrestrial data; the ungrib application, which extracts the input initial/boundary conditions data; and the metgrid application, which runs after the other two have finished and horizontally interpolates the intermediate-format meteorological data extracted by ungrib onto the simulation domains defined by geogrid.

When pre-processing is finished, the core application starts: first the real application, which sets up the vertical model levels for the model input and boundary files, and finally the wrf application, which performs the numerical integration. Only the wrf application is run using MPI on multiple CPUs/cores.
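The stage ordering described above can be sketched as a list of command lines. This is only an illustration, not the contents of the actual execution script: the executable names follow the usual WPS/WRF naming (geogrid.exe, ungrib.exe, metgrid.exe, real.exe, wrf.exe), and the `mpirun -np` launcher is an assumption about the MPI environment on the CE.

```python
def build_pipeline(num_procs, mpi_launcher="mpirun"):
    """Return the WPS/WRF stages in execution order as argv lists."""
    # Pre-processing (WPS) followed by real; all of these run serially.
    serial_stages = ["geogrid.exe", "ungrib.exe", "metgrid.exe", "real.exe"]
    pipeline = [[f"./{name}"] for name in serial_stages]
    # Only the wrf stage is started through MPI, as described above.
    pipeline.append([mpi_launcher, "-np", str(num_procs), "./wrf.exe"])
    return pipeline


for argv in build_pipeline(16):
    print(" ".join(argv))
```

Keeping the stages as data makes the dependency explicit: metgrid needs the results of both geogrid and ungrib, and wrf needs the files written by real.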

After successful completion, the output data are stored on a pre-defined grid SE. The user can download the model output data from the SE using the LCG tools, the DM-Web 4.1.1 application service developed within the SEEGRID project, or the scripts developed for automatic collection of WRF model data.

Job submission is the process that sets all model parameters, describes the job using JDL (Job Description Language), updates the WRF model namelist files (model description files for domains, physics, etc.) and submits the model to the grid using the gLite tools. Based on the job description, the WMS allocates a grid CE where the user’s job will be executed. The main script for submitting the model to the grid is wrf-submit. The script is responsible for defining and setting almost all parameters needed for model execution. Based on the user’s command-line parameters, the script sets the model execution type (scientific or operational), the number of processors the model will run on, the model input files and some other model-specific parameters.
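To give an idea of what a JDL job description looks like, here is a minimal sketch that renders one as text. The JDL files actually generated by wrf-submit are not shown on this page, so the attribute selection, the sandbox contents and the argument layout below are assumptions; only standard JDL attribute names are used.

```python
def render_jdl(executable, arguments, num_procs, input_files, output_files):
    """Render a small JDL job description as text."""
    def quoted_list(items):
        return "{" + ", ".join(f'"{item}"' for item in items) + "}"

    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        f"CpuNumber = {num_procs};",
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {quoted_list(input_files)};",
        f"OutputSandbox = {quoted_list(output_files)};",
    ]
    return "\n".join(lines)


jdl_text = render_jdl(
    executable="model_run.sh",
    arguments="CROATIA 16",  # hypothetical argument layout
    num_procs=16,
    input_files=["model_run.sh", "namelist.wps", "namelist.input"],
    output_files=["std.out", "std.err"],
)
print(jdl_text)
```

The input sandbox is what travels with the job from the UI to the CE, which is why the execution script and the two namelist files are listed there in this sketch.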

The main model execution script on the grid CE is model_run.sh. It is sent from the user’s UI node together with the job submission and is responsible for all model operations on the CE. Its first steps are to download and extract the model binaries, set up the file structure, and download the static (terrestrial) data from the LFC catalogue and the time-dependent (initial/boundary conditions) data from the LFC catalogue or the NCEP server, depending on the execution mode.

Scientific mode

As mentioned above, the scientific mode is used for research purposes, when users do not need current boundary conditions data downloaded from NCEP, ECMWF (European Centre for Medium-Range Weather Forecasts) or any other large-scale forecasting center. Users are expected to manually upload their initial/boundary conditions data to the LFC catalogue, in the input data folder for their region. The “raw” folder is used for storing data in the GRIB file format, which is typical for meteorological data.

As in a serial, non-grid execution of the model, the user has to describe the model using the WRF files namelist.wps and namelist.input. The user needs to define all relevant variables in these two files so that the model can execute normally. The model is executed by running the wrf-submit command. For example, to submit the model in the scientific mode, with the region of interest “CROATIA” and using 16 CPUs:


$ wrf-submit -m s -r CROATIA -p 16


This command automatically checks for the namelist files in the current folder, sets the environment variables, generates the JDL job description files and submits the jobs using the glite-wms-job-submit command from the gLite tools. The output of the wrf-submit command is a file that contains the user’s job ID, which is required to check the job status and to retrieve the output when the job finishes. After the execution has finished, the model output data are stored using the LCG tools in the LFC file scheme as /WRF-ARW/output_data/<REGION>/sci/<output_filename>.tar.gz.

The user can retrieve the output data on the UI node using the lcg-cp tool (the standard LCG tool for copying grid files to the UI) or the wrf-get-data program, which collects the output of finished jobs and downloads the model results from the grid (LFC) to the UI. The other option is to download the data directly to the local computer using the DM-Web server, which can be used from a web browser.
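As a sketch of the manual retrieval route, the helper below assembles an lcg-cp command line for one scientific-mode archive. The archive name is a placeholder, the region folder is assumed to be spelled exactly like the region passed to wrf-submit, and the options used by wrf-get-data itself are not documented here, so they are not reproduced.

```python
from pathlib import PurePosixPath

VO = "meteo.see-grid-sci.eu"
WRF_ROOT = PurePosixPath("/grid") / VO / "WRF-ARW"


def lcg_cp_command(region, archive_name, local_dir):
    """Build an lcg-cp argv that copies one scientific-mode archive to local_dir.

    local_dir is expected to be an absolute path on the UI node.
    """
    source = WRF_ROOT / "output_data" / region / "sci" / archive_name
    target = PurePosixPath(local_dir) / archive_name
    return ["lcg-cp", "--vo", VO, f"lfn:{source}", f"file:{target}"]


# "example_output.tar.gz" is a hypothetical archive name.
command = lcg_cp_command("CROATIA", "example_output.tar.gz", "/tmp")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run on a UI node where a valid VOMS proxy exists.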

Operational mode

The operational mode is designed to be as simple as possible and to require little user intervention, so that it can be executed automatically every day. Users have to adjust all model variables in the namelist files, while the start and end dates of the simulation are set automatically (the start date is set to the current date). When executing in the operational mode, the input initial/boundary conditions data are downloaded automatically from the NCEP server to the grid CE, and the user’s VOMS proxy certificate is created automatically. For example, to start an operational forecast using 8 processors, with a simulation time of 48 hours (a 2-day weather forecast) and the “BIH” region:


$ wrf-submit -m o -r BIH -p 8 -l 48


The output of the command is, again, a file containing the job ID. In the operational forecast, most of the parameters, e.g. the simulation time period (48 h in the example) or the number of processors, have default values.
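The two example invocations use four options: -m (mode), -r (region), -p (processors) and -l (forecast length in hours). A small helper that assembles such a command line might look like the sketch below. It uses only the options that appear in the examples; whether -l is accepted in scientific mode is not stated, so the helper simply leaves out any option that is not given.

```python
def wrf_submit_args(mode, region, num_procs=None, hours=None):
    """Build a wrf-submit argv from the options shown in the examples."""
    if mode not in ("s", "o"):
        raise ValueError("mode must be 's' (scientific) or 'o' (operational)")
    args = ["wrf-submit", "-m", mode, "-r", region]
    if num_procs is not None:
        args += ["-p", str(num_procs)]   # number of processors
    if hours is not None:
        args += ["-l", str(hours)]       # simulation length in hours
    return args


print(" ".join(wrf_submit_args("o", "BIH", num_procs=8, hours=48)))
print(" ".join(wrf_submit_args("s", "CROATIA", num_procs=16)))
```

Omitting an option here corresponds to relying on the script's default for that parameter.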

During operational forecasting we faced the problem that many jobs stayed in the queue of the local batch system on the CE for a long time. As the operational forecast has to be finished at certain times of the day, we arranged an explicit resource reservation on two grid CEs in Bulgaria (BG03-NGCC and BG04-ACAD, part of the SEEGRID grid infrastructure). After the resource reservation, the problem with the execution time was solved.

Users who need to run the operational forecast on the reserved resources have to be members of the group /meteo.see-grid-sci.eu/HR/App/WRF-ARW/ with the role “Developer”, because the reservation is made for this group within the meteo VO.

A job scheduler on the UI, such as cron (other job schedulers are also possible), is required to submit the model to the grid automatically every day.
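A daily cron entry for the operational run could be generated as in the sketch below. The submission time and the working directory are placeholders. The `cd` is there because wrf-submit looks for the namelist files in the current folder; cron also starts jobs with a minimal environment, so in practice the PATH to wrf-submit and the grid tools would have to be set as well.

```python
def crontab_line(hour, minute, work_dir, region, num_procs, hours):
    """Return one crontab entry that submits the operational run daily."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    command = (
        f"cd {work_dir} && "
        f"wrf-submit -m o -r {region} -p {num_procs} -l {hours}"
    )
    # Five time fields: minute, hour, day of month, month, day of week.
    return f"{minute} {hour} * * * {command}"


# /home/user/wrf is a hypothetical folder holding namelist.wps and namelist.input.
print(crontab_line(3, 0, "/home/user/wrf", "BIH", 8, 48))
# prints: 0 3 * * * cd /home/user/wrf && wrf-submit -m o -r BIH -p 8 -l 48
```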

The tools for retrieving the model results and the output data are the same as for the scientific mode.

WRF-ARW storage scheme

As WRF-ARW is a big and complex application that requires a lot of different input and output data, we need to store these data on the grid infrastructure. The data are stored in a Logical File Catalogue (LFC). For this reason an LFC storage scheme for the WRF-ARW application has been created and is used for storing:

  • Software packages and auxiliary files - Packages containing the model binary executables, additional required scripts and auxiliary data. These are fetched from the LFC to the worker node before the model execution starts.
  • Various model input data - These are static terrestrial data and initial and boundary condition data. They are fetched to the worker node, from the LFC or from the NCEP server, before the model execution starts.
  • Various data files produced by the models - Such files typically contain the raw output data and the post-processed output.

The listing below shows the WRF-ARW storage scheme in the LFC catalogue.



/grid
 /meteo.see-grid-sci.eu
   /WRF-ARW
     /bin
       /WPS
       /WRF
       /postproc
     /input_data
       /terrain
         /raw
         /precompiled
       /boundary
         /<REGION>
           /raw
           /precompiled
     /output_data
       /<REGION>
         /sci
         /oper
           /<YYYYMMDD>


Users who want to use the WRF model on the grid must be members of the meteo VO. As part of the meteo VO, WRF uses the data management tools provided by this VO. The tool for data management and replica cataloguing is the LFC catalogue. The root folder for the meteo VO within the LFC catalogue is: /grid/meteo.see-grid-sci.eu/.

In this folder all members of the VO have permission to write, read and modify all files and folders. We decided to create a root folder for the WRF-ARW model, called WRF-ARW, within the meteo VO root folder, so that the full path to the model root folder is: /grid/meteo.see-grid-sci.eu/WRF-ARW/.

Users are divided by region of interest, which usually follows their regional affiliation. The regions include Armenia, BiH (Bosnia and Herzegovina), Croatia, Georgia, and SEEurope, which is the default region for all users.

In the root folder there are three main folders: “bin”, “input_data”, and “output_data”.

Bin folder

The “bin” folder holds all the binaries and auxiliary files for pre-processing (“WPS” folder), the main WRF-ARW binaries, i.e. the core application (“WRF” folder), and the post-processing binaries and visualization tools (“postproc” folder). All binaries, depending on the version, are stored in this folder (Figure 65).

Input_data folder

The “input_data” folder holds the input data for the WRF pre-processor (WPS). The static terrestrial data, which do not change between runs, are stored in the “terrain” folder, and the time-dependent initial and boundary conditions (the input for the ungrib.exe application in the WPS pre-processing system) are stored in the “boundary” folder. The initial and boundary conditions are divided by region of interest because these data are region-specific and to prevent unnecessary mixing of data between different users and regions.
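The input data layout can be captured in two small path helpers. This is a sketch based only on the storage scheme listed on this page; the function names are illustrative, and the exact spelling of each region folder is an assumption.

```python
from pathlib import PurePosixPath

INPUT_ROOT = PurePosixPath("/grid/meteo.see-grid-sci.eu/WRF-ARW/input_data")
KINDS = ("raw", "precompiled")


def terrain_dir(kind="raw"):
    """Folder with static terrestrial data, shared by all regions."""
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {KINDS}")
    return INPUT_ROOT / "terrain" / kind


def boundary_dir(region, kind="raw"):
    """Folder with initial/boundary conditions for one region."""
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {KINDS}")
    return INPUT_ROOT / "boundary" / region / kind


print(terrain_dir())
print(boundary_dir("CROATIA"))
```

Terrain data have no region level because they are shared between runs, while boundary data are always addressed through a region folder.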

Output_data folder

The “output_data” folder holds all the outputs produced once the model has finished executing on the grid. Data are separated by region of interest and by whether the model was run for scientific or operational purposes (“sci” and “oper” folders). In the “oper” folder, data are stored in folders following the YYYYMMDD pattern. The naming convention for the output data is: username_ARW-RAW_filename_HHMMSS.tar.gz, where username is the user name of the current user on the UI node, filename is the name of the input file (in scientific mode) or YYYYMMDD (in operational mode), and HHMMSS is a unique file code derived from the time the model was submitted.
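The naming convention can be expressed as a short function that returns the full LFC path of an output archive. This is a sketch under stated assumptions: the date folder in operational mode is taken to be the submission date, and the user name, region and input name in the example are made up.

```python
from datetime import datetime
from pathlib import PurePosixPath

OUTPUT_ROOT = PurePosixPath("/grid/meteo.see-grid-sci.eu/WRF-ARW/output_data")


def output_lfn(username, region, submitted, input_name=None):
    """Return the LFC path of an output archive.

    Pass input_name for scientific mode; leave it as None for operational mode.
    """
    stamp = submitted.strftime("%H%M%S")
    if input_name is not None:
        folder = OUTPUT_ROOT / region / "sci"
        label = input_name
    else:
        # Assumption: the YYYYMMDD folder matches the submission date.
        day = submitted.strftime("%Y%m%d")
        folder = OUTPUT_ROOT / region / "oper" / day
        label = day
    return folder / f"{username}_ARW-RAW_{label}_{stamp}.tar.gz"


when = datetime(2010, 3, 5, 6, 7, 8)
print(output_lfn("alice", "BIH", when))
print(output_lfn("alice", "CROATIA", when, input_name="case1"))
```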

All data transfers go through the LFC, using the relevant command-line tools available on gLite nodes (UIs and WNs), such as lcg-cp, lcg-cr and lcg-rf.

Downloads

Source code

The WRF-ARW source code is available for download at:

https://svn.egee-see.org/svn/wrf-arw.see-grid-sci.eu/WRF_ARW_grid_v2.tar.gz

The application source is also available from svn. Use the following command to check out the latest stable release:

svn checkout https://svn.egee-see.org/svn/wrf-arw.see-grid-sci.eu/WRF_ARW_2.0

Documentation
