MsgHritScheduler

The WCS-MsgHritScheduler is a fairly simple Java application that watches the file system of the DataServer for new incoming files. Batch routines convert these files to GeoTIFF (WGS84 spatial reference system), and the results are then added to the Mapserver as a new coverage.

Features

  • observes the file structure for new files that need to be processed
  • passes incoming files to external batch routines, which convert them into more common data formats (e.g. GeoTIFF)
  • adds the converted data to the Mapserver as a new time coverage by
    • appending the timestamp to the mapfile
    • adding a new shape geometry to the tile index (which consists of a shapefile and a dbf file) using gdaltindex
    • appending the timestamp to the dbf file
  • if the free storage on the hard drive drops below 1 GB, the oldest time coverage is deleted from the mapfile and the hard drive
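
The low-storage rule from the last bullet could be sketched roughly as follows. This is only an illustration, not the scheduler's actual code: the class `CoverageCleanup` and its method names are hypothetical, and it assumes that coverages are identified by ISO timestamps like the ones stored in the properties file. `File.getUsableSpace()` is the standard Java call for the free space of a partition.

```java
import java.io.File;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: names are illustrative and not taken from the scheduler's source.
final class CoverageCleanup {

    // Threshold from the feature list: 1 GB of free space.
    static final long MIN_FREE_BYTES = 1024L * 1024L * 1024L;

    // Pure decision rule, kept separate so it can be checked without touching a disk.
    static boolean needsCleanup(long usableBytes) {
        return usableBytes < MIN_FREE_BYTES;
    }

    // Applies the rule to the partition that holds the given data folder.
    static boolean isStorageLow(File dataFolder) {
        return needsCleanup(dataFolder.getUsableSpace());
    }

    // ISO timestamps such as "2010-09-22T06:00:00" sort chronologically as plain
    // strings, so the smallest string is the oldest coverage. Returns null if empty.
    static String oldestCoverage(List<String> isoTimestamps) {
        return isoTimestamps.isEmpty() ? null : Collections.min(isoTimestamps);
    }
}
```

A caller would check `CoverageCleanup.isStorageLow(new File(uploadPath))` after each processing cycle and, if it returns true, remove the coverage returned by `oldestCoverage(...)` from the mapfile and the hard drive.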

Limitations

  • Up to now the scheduler works ONLY with MSG2_HRIT data. In general it would be possible to extend or adapt it for further products. This requires adjusting the preconditions for the new product in the program code (detecting when a product for a certain timestamp has been completely received). Additionally, the batch file needs to point to an external library that can transform the product into GeoTIFF using WGS84 as the spatial reference system (for example GDAL with extended input reading capabilities).
  • Filename matching and timestamp extraction are done in a static, hard-coded way (e.g. using substring methods in Java). Up to now this works fairly well, but it should be done in a more generic way (e.g. using regex patterns).
  • After the first start, the structure of the dbf file has to be extended manually (by appending a column for the "time" dimension), since all Java libraries change the column headers to upper case, whereas Mapserver refers to the columns "location" and "time" case-sensitively.
  • Coverages that have been removed from the hard drive are not removed from the dbf file, for the same reason.
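
As an illustration of the regex-based approach suggested above, the following sketch parses an HRIT filename and converts its timestamp into the ISO form used in the properties file. It is not the scheduler's current code: the class `HritFileName`, its fields and the exact pattern are assumptions derived only from the example filenames on this page.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser: the class and field names are illustrative only.
final class HritFileName {

    // Six hyphen-separated fields after the "H-000" prefix; the fifth is yyyyMMddHHmm.
    private static final Pattern HRIT = Pattern.compile(
            "^H-000-([^-]+)-([^-]+)-([^-]+)-([^-]+)-(\\d{12})-([^-]+)$");
    private static final DateTimeFormatter FILE_TIME =
            DateTimeFormatter.ofPattern("uuuuMMddHHmm");
    private static final DateTimeFormatter ISO_TIME =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss");

    final String channel;      // e.g. "WV_062"; empty when the field is all padding
    final String segment;      // e.g. "000008", "PRO" or "EPI"
    final LocalDateTime time;  // timestamp encoded in the filename

    private HritFileName(String channel, String segment, LocalDateTime time) {
        this.channel = channel;
        this.segment = segment;
        this.time = time;
    }

    // Returns null if the name does not look like an HRIT file.
    static HritFileName parse(String fileName) {
        Matcher m = HRIT.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        try {
            LocalDateTime t = LocalDateTime.parse(m.group(5), FILE_TIME);
            return new HritFileName(stripPadding(m.group(3)), stripPadding(m.group(4)), t);
        } catch (DateTimeParseException e) {
            return null; // twelve digits that do not form a valid date
        }
    }

    // Timestamp in the form used for latest.upload, e.g. "2010-07-15T18:00:00".
    String isoTime() {
        return time.format(ISO_TIME);
    }

    // The fields are padded with trailing underscores to a fixed width.
    private static String stripPadding(String field) {
        return field.replaceAll("_+$", "");
    }
}
```

Because the pattern only relies on the hyphen-separated layout, the same parser would also accept filenames of other satellites or products that follow this layout, which is the kind of flexibility the hard-coded substring approach lacks.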

Get the scheduler running

Pre-Requirements & Dependencies

  • Version of Mapserver installed -> MapserverInstallation
  • an installation of ILWIS (can be downloaded/requested from ITC) or GDAL with the MSG driver activated
  • 32-bit Java (since the jpathwatch library is only available for 32-bit JVMs)
  • GDAL toolbox
To convert the raw Geonetcast stream data, the program uses the GDAL library compiled with the MSG driver (which relies on the Public Wavelet Transform Decompression Library Software). Due to license restrictions, the MSG driver is not activated by default in GDAL. To build GDAL with the MSG driver activated, one has to request this library from Eumetsat: http://oiswww.eumetsat.int/WEBOPS-cgi/wavelet/register
For detailed information: http://www.gdal.org/frmt_msg.html

For simplicity, the batch processing file points to the precompiled GDAL Windows version that ships with the Geonetcast extension of ILWIS (located in Extensions\Geonetcast-Toolbox\MSGDataRetriever\gdalwarp.exe).

In addition we need gdaltindex, which is part of the GDAL framework. Here we use FWTools (a collection of several precompiled GDAL tools). After installing FWTools, we have to add the FWTools path to the Windows system path variable. We also need to create a new system variable "GDAL_DATA" pointing to the GDAL data folder. This is needed because we make use of EPSG codes.

add_gdal_data_path.jpg
add_fwtools_path.jpg
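
Whether both settings are in place can be checked before the scheduler is started. The following sketch shows one possible preflight check. The class `GdalPreflight` and its methods are hypothetical and not part of the scheduler; only the variable name "GDAL_DATA" and the tool name gdaltindex come from the text above.

```java
import java.io.File;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical preflight check: not part of the scheduler itself.
final class GdalPreflight {

    // True if the GDAL_DATA variable exists and is not blank.
    static boolean hasGdalData(Map<String, String> env) {
        String value = env.get("GDAL_DATA");
        return value != null && !value.trim().isEmpty();
    }

    // Searches every folder of a PATH-style value for the given executable.
    // Returns the first match, or null if the executable is not found.
    static File findOnPath(String pathValue, String executableName) {
        if (pathValue == null) {
            return null;
        }
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, executableName);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }
}
```

On the Windows machine one would call `GdalPreflight.hasGdalData(System.getenv())` and `GdalPreflight.findOnPath(System.getenv("PATH"), "gdaltindex.exe")`. A false or null result means the system variables described above still need to be set.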

Configuration

Step 1: Scheduler configuration

The MsgHritScheduler consists of three files: the .bat file, the .jar file and an XML config file. To set up the service, we mainly need to edit msg_hrit_properties.xml. The following folder/file structure is required:

+ \mainfolder
  • msg_hrit_scheduler.bat
  • msg_hrit_scheduler.jar
+ \config
  • msg_hrit_properties.xml
+ \batches
  • msg_hrit.bat
+ \logs

All variables are read by the scheduler from msg_hrit_properties.xml, so apart from the Java path in the .bat file (Step 2) there is no need to edit other files.

For the first start, adjust all of the following values (using backslashes "\", since we are working on a Windows system):
  • upload.path - path to the folder where the converted files should be stored, e.g. "C:\ms4w\apps\geonetcast\data\msg_hrit\"
  • mapfile.path - path to the mapfile, e.g. "C:\ms4w\apps\geonetcast\service\geonetcast.map"
  • msg_hrit.script - path to the .bat script that converts the raw data into GeoTIFF files, e.g. "C:\geonetcast_processing\batches\msg_hrit.bat"
  • dbf.path - path to the .dbf index generated by gdaltindex, e.g. "C:\ms4w\apps\geonetcast\data\msg_hrit\msg_time_idx.dbf"
  • ilwis.path - path to the ILWIS root folder, e.g. "C:\ilwis\"
  • msg_hrit.folder - path to the folder where the raw msg2_hrit data is stored, e.g. "Z:\classified\OTHER\MSG2_HRIT\" (Z: is here the mounted network drive of the DataServer)
  • latest.upload - in this field the scheduler automatically stores the timestamp that was converted most recently, e.g. "2010-09-22T06:00:00", so that only files with a newer date are processed and the whole file system does not need to be re-checked. This is helpful e.g. after a system restart.
    If you start the scheduler for the first time, enter the timestamp preceding the first one that should be processed. Example: if the first file to be processed from the DataServer has the timestamp "201007151800", you should enter "2010-07-15T12:00:00".
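
The role of latest.upload can be illustrated with a short sketch. It assumes that msg_hrit_properties.xml uses the standard Java XML properties format read by `Properties.loadFromXML`. The actual layout of the file is not shown on this page, so this is an assumption, and the class `SchedulerConfig` and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.time.LocalDateTime;
import java.util.Properties;

// Hypothetical config reader; assumes the standard Java XML properties format.
final class SchedulerConfig {

    // Reads all entries from an XML properties stream (the stream is closed afterwards).
    static Properties load(InputStream in) {
        Properties props = new Properties();
        try {
            props.loadFromXML(in);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read the properties file", e);
        }
        return props;
    }

    // True if the candidate timestamp is later than the stored latest.upload value,
    // i.e. the product has not been converted yet.
    static boolean isNewer(Properties props, String candidateIso) {
        LocalDateTime latest = LocalDateTime.parse(props.getProperty("latest.upload"));
        return LocalDateTime.parse(candidateIso).isAfter(latest);
    }
}
```

With latest.upload set to "2010-07-15T12:00:00" as in the example above, a product stamped "2010-07-15T18:00:00" counts as new, while one stamped "2010-07-15T12:00:00" is skipped.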

Step 2: Adjust the Java path in the .bat file to the location of the 32-bit Java VM

Adjust the path in msg_hrit_scheduler.bat (right click -> edit with a text editor) so that it points to your 32-bit Java VM installation.

Step 3: Run the Scheduler for one processing cycle

To test everything, run msg_hrit_scheduler.bat. Abort the scheduler with "Ctrl+C" after the first file has been completely processed.

Step 4: DBF manipulation

In order to add each processed GeoTIFF to the Mapserver as a new coverage along the time dimension, we apply a small hack and misuse the tile index mechanism for time indexing (for a more elaborate explanation, have a look at the official documentation and the mapserver mailing list). This can be done using either a PostGIS database or a simple shape/dbf index.

Since this is a simple proof-of-concept use case, we decided to use the shape/dbf index. However, it might be reasonable to think about switching to a PostGIS index (see remarks).

After each processing cycle, gdaltindex is run (as defined in the batch file). gdaltindex automatically creates a geometry in the shapefile for every new GeoTIFF and adds its location path to the dbf file. If no index exists, gdaltindex creates a new one. This means that after the first processing cycle the index files should be visible in the GeoTIFF folder.

dbf_indexfiles.jpg

Now, once at the beginning, we have to edit the structure (the table schema) of the DBF file. More precisely, we need to append a new column "time" for the time dimension. This can be done e.g. with tools like Dans DBFExplorer.

dbf_manipulation.jpg

Afterwards, we can close the dbf file and restart the scheduler. From now on, the scheduler also automatically updates the "time" column in the dbf file according to the timestamp given in the filename.

How does the scheduler work?

The Java program itself just checks for new files. The file structure of the incoming MSG files should look like this:

  • All files are located in dated folders (according to our configuration in DataManager), so the path is: \\GNC-DATA\Geonetcast\classified\OTHER\MSG2_HRIT\{year}\{month}\{day}
  • A product in raw data can consist of several sub-files, such as "H-000-MSG2__-MSG2________-WV_062___-000008___-201007151800-C_", which are received by the ground station at different times. To convert a product into another file format such as GeoTIFF, all of these files are needed. Since this differs from product to product in various ways, there is no common "generic" procedure to check whether a product is complete for a certain timestamp.
    For MSG products, we know that there are prologue and epilogue files:
    • Prologue: H-000-MSG2__-MSG2________-_________-PRO______-201007151800-__
    • Epilogue: H-000-MSG2__-MSG2________-_________-EPI______-201007151200-__
    The dissemination system first sends the PRO file, then the data, and lastly the EPI file. So only when an EPI file arrives (meaning the product has been completely received) does the scheduler start the .bat file with the appropriate parameters.
  • the .bat-file starts the conversion using gdal:
    • \ilwis\Extensions\Geonetcast-Toolbox\MSGDataRetriever\gdalwarp.exe --config GDAL_CACHEMAX 30 -t_srs "+proj=latlong" -of GTiff MSG(%1\,%2%3%4%5,(1,2,3,4,5,6,7,8,9,10,11),N,B,1,1) %6.tif
    • the values with a ‘%’ in front of them are parameters supplied by the scheduler: %1 raw data absolute path; %2 year; %3 month; %4 day; %5 time; %6 storage path of the converted data (without the .tif extension)
    • proj=latlong defines the reference system
    • (1,2,3,4,5,6,7,8,9,10,11) are the channels that should be taken into account
  • subsequently the line "gdaltindex -write_absolute_path \msg_time_idx.shp \*.tif" adds the new coverage to the shape index, pointing to the locations of the GeoTIFF files according to their timestamp
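
The trigger logic described in this section can be sketched as follows: an arriving epilogue file marks the product as complete, and its timestamp is split into the parameters %2 to %5 of the batch file. This is a simplified illustration under assumptions, not the scheduler's real code. The class `EpilogueTrigger` and its methods are hypothetical, and the order of the arguments is taken from the parameter list above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "start the batch file when the EPI file arrives" rule.
final class EpilogueTrigger {

    // Returns the yyyyMMddHHmm timestamp if fileName is an epilogue file, else null.
    static String completedTimestamp(String fileName) {
        String[] parts = fileName.split("-");
        boolean isEpilogue = parts.length == 8
                && parts[5].startsWith("EPI")
                && parts[6].matches("\\d{12}");
        return isEpilogue ? parts[6] : null;
    }

    // Builds the command line: script, then %1 raw data path, %2 year, %3 month,
    // %4 day, %5 time and %6 output path without the .tif extension.
    static List<String> batchCommand(String script, String rawFolder,
                                     String timestamp, String outputBase) {
        if (!timestamp.matches("\\d{12}")) {
            throw new IllegalArgumentException("Expected yyyyMMddHHmm: " + timestamp);
        }
        return Arrays.asList(
                "cmd.exe", "/c", script,
                rawFolder,
                timestamp.substring(0, 4),   // %2 year
                timestamp.substring(4, 6),   // %3 month
                timestamp.substring(6, 8),   // %4 day
                timestamp.substring(8, 12),  // %5 time (HHmm)
                outputBase);
    }
}
```

On Windows the resulting list could be started with `new ProcessBuilder(command).inheritIO().start()`. Files other than the epilogue (prologue and data segments) return null from `completedTimestamp` and are simply ignored until the product is complete.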

Remarks & Things-to-do

  • in future work it might be reasonable to integrate the scheduler directly into the DataManager, since it already knows when new products have been completely received, so there would be no need to observe the file system with additional libraries
  • try to use PostGIS instead of the shape/dbf index (might be faster, and the time column could be filled automatically by a simple self-written trigger)
  • WMS is sometimes very slow: try out sub-tile indexing and other interpolation methods
  • try out different rendering styles for WMS
  • extend/redefine metadata parameters in the mapfile
  • time dimensions are currently not encoded in the DescribeCoverage response of WCS 1.1.0 (might be a bug in Mapserver or erroneous metadata elements/configuration in the mapfile)
-- JohannesTrame - 2010-09-23
Topic revision: r10 - 24 Sep 2010, JohannesTrame