WPS4R Development
Source Code
The source code is available in the WPS master branch on GitHub: https://github.com/52North/WPS
Developers
- Matthias Hinz
- Benjamin Proß
- Daniel Nüst
Sub-projects
Contributor Documentation
WPS4R Source Setup
For the source setup, please check out the 52n WPS trunk and follow these instructions: http://52north.org/communities/geoprocessing/wps/source_setup.html.
Afterwards, follow the setup instructions of WPS4R, starting with step 2.
If you already have a locally running Rserve instance and don't want to customize anything, you can skip most of these instructions, but have a look at step 5 to ensure that the required R libraries are pre-loaded and the default working directory is chosen properly.
If you want to connect WPS4R with a remote Rserve instance, you will need to add configuration parameters to WPS4R. They are described in step 8 of the WPS4R setup instructions.
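As a rough illustration of what such connection parameters amount to, the following sketch resolves an Rserve host and port from a set of properties and falls back to defaults for a local instance. The property names R_HOST and R_PORT are the ones mentioned in the class overview of this page. The class RserveEndpoint and its methods are hypothetical and not part of WPS4R, and the defaults (localhost, port 6311) are assumptions based on the standard Rserve port.

```java
import java.util.Properties;

// Hypothetical helper, not part of WPS4R: resolves the Rserve endpoint from properties.
final class RserveEndpoint {
    // Assumed defaults: a local Rserve instance on the standard Rserve port.
    static final String DEFAULT_HOST = "localhost";
    static final int DEFAULT_PORT = 6311;

    final String host;
    final int port;

    RserveEndpoint(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Reads R_HOST and R_PORT, using the defaults for missing or blank values.
    static RserveEndpoint fromProperties(Properties props) {
        String host = props.getProperty("R_HOST", DEFAULT_HOST).trim();
        if (host.isEmpty()) {
            host = DEFAULT_HOST;
        }
        int port = DEFAULT_PORT;
        String rawPort = props.getProperty("R_PORT");
        if (rawPort != null && !rawPort.trim().isEmpty()) {
            port = Integer.parseInt(rawPort.trim());
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("R_PORT out of range: " + port);
        }
        return new RserveEndpoint(host, port);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("R_HOST", "rserve.example.org"); // placeholder host name
        RserveEndpoint endpoint = fromProperties(props);
        System.out.println(endpoint.host + ":" + endpoint.port);
    }
}
```

Keeping the fallback logic in one place means a purely local setup needs no extra configuration, while a remote setup only overrides the values that differ.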
Module overview
The module corresponding to WPS4R is named "52n-wps-r". It contains a single package named "org.n52.wps.server.r". Its most important classes and their relationships are displayed in the class diagram below.
Important classes
LocalRAlgorithmRepository (implements ITransactionalAlgorithmRepository) acts as the module's repository for R processes. For each R script available in a specific folder, an instance of GenericRProcess is created.
GenericRProcess (extends AbstractObservableAlgorithm) instantiates proxy objects for each R process. It generates the process description by invoking the RProcessDescriptionCreator, handles inputs and outputs, executes the related R script and performs all operations related to Rserve.
RPropertyChangeManager (implements PropertyChangeListener) is implemented as a singleton and registers as an observer of the WPSConfig. It synchronizes the properties of the "LocalRAlgorithmRepository" listed in the WPSConfig-file with the actual properties of the WPS4R module. This class (re-)acts when R scripts are uploaded, when "algorithm"-properties are removed or disabled, or when properties related to Rserve (e.g. R_HOST, R_PORT) are changed within the config-file. On repository start-up, it adds missing properties to the RConfig-file, such as "algorithm"-properties for unregistered R scripts and default values for the Rserve connection. Finally, it takes care of the order of the properties listed for the repository.
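The observer relationship described above can be sketched with the standard java.beans classes. This is only a minimal illustration under stated assumptions: DemoConfig stands in for the WPSConfig, DemoChangeManager for the RPropertyChangeManager, and both are hypothetical classes that do not reproduce the actual WPS4R code. The listener merely records the changes it sees instead of synchronizing anything.

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the observed configuration object.
final class DemoConfig {
    private final PropertyChangeSupport support = new PropertyChangeSupport(this);
    private String rHost = "localhost";

    void addListener(PropertyChangeListener listener) {
        support.addPropertyChangeListener(listener);
    }

    // Notifies listeners; PropertyChangeSupport skips the event if the value did not change.
    void setRHost(String newHost) {
        String oldHost = rHost;
        rHost = newHost;
        support.firePropertyChange("R_HOST", oldHost, newHost);
    }
}

// Hypothetical singleton listener that records every change it is notified about.
final class DemoChangeManager implements PropertyChangeListener {
    private static final DemoChangeManager INSTANCE = new DemoChangeManager();
    private final List<String> seen = new ArrayList<>();

    private DemoChangeManager() {
    }

    static DemoChangeManager getInstance() {
        return INSTANCE;
    }

    @Override
    public void propertyChange(PropertyChangeEvent event) {
        seen.add(event.getPropertyName() + "=" + event.getNewValue());
    }

    // Returns a copy so callers cannot modify the internal record.
    List<String> seenChanges() {
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        DemoConfig config = new DemoConfig();
        config.addListener(getInstance());
        config.setRHost("rserve.example.org"); // placeholder host name
        System.out.println(getInstance().seenChanges());
    }
}
```

A real manager would react to each event, for example by re-reading connection settings, rather than just logging it.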
R_Config bundles information about common parameters and configurations with static methods related to R, Rserve and the WPS4R module in general. This includes mappings from R script files to process identifiers and vice versa, as well as constants denoting important directories. Each connection to Rserve shall be established via the static method "openRConnection()".
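A two-way mapping between script files and process identifiers could look roughly like the sketch below. ScriptRegistry is a hypothetical class, not the actual R_Config. The identifier scheme (package name plus script name without the ".R" extension) is an assumption made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bidirectional script-file / process-identifier mapping.
final class ScriptRegistry {
    // Assumed identifier prefix, borrowed from the package name of the module.
    static final String PREFIX = "org.n52.wps.server.r.";

    private final Map<String, String> fileToId = new LinkedHashMap<>();
    private final Map<String, String> idToFile = new LinkedHashMap<>();

    // Registers a script file name such as "uniform.R".
    // Returns the derived identifier, or null if the name is not an R script.
    String register(String fileName) {
        int cut = fileName.length() - 2;
        if (cut <= 0 || !fileName.regionMatches(true, cut, ".R", 0, 2)) {
            return null;
        }
        String id = PREFIX + fileName.substring(0, cut);
        fileToId.put(fileName, id);
        idToFile.put(id, fileName);
        return id;
    }

    String idFor(String fileName) {
        return fileToId.get(fileName);
    }

    String fileFor(String id) {
        return idToFile.get(id);
    }

    public static void main(String[] args) {
        ScriptRegistry registry = new ScriptRegistry();
        String id = registry.register("uniform.R"); // example script name
        System.out.println(id + " <-> " + registry.fileFor(id));
    }
}
```

Maintaining both maps in a single register method keeps the two directions consistent with each other.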
R_Annotation implements the complete semantic definition of WPS4R annotations. It consists of enumerations denoting each symbol or keyword and its properties. At the same time, objects of class R_Annotation are instantiated by RAnnotationParser; each one is a proxy object for one annotation. Lists of R_Annotation objects are passed to GenericRProcess and the RProcessDescriptionCreator to extract certain information about specific R processes.
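To make the idea of annotation proxy objects more tangible, here is a simplified sketch that turns annotated comment lines of an R script into typed objects with key-value attributes. It is not the actual R_Annotation or RAnnotationParser code. The classes DemoAnnotationType and DemoAnnotation are hypothetical, and the assumed line format (a comment starting with a keyword such as "wps.des:", "wps.in:" or "wps.out:", followed by comma-separated "key = value" pairs and a closing semicolon) is a simplification that ignores multi-line annotations and quoted values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical annotation keywords, assumed for this sketch.
enum DemoAnnotationType {
    DESCRIPTION("wps.des"), INPUT("wps.in"), OUTPUT("wps.out");

    final String keyword;

    DemoAnnotationType(String keyword) {
        this.keyword = keyword;
    }
}

// Hypothetical proxy object for a single annotation line.
final class DemoAnnotation {
    final DemoAnnotationType type;
    final Map<String, String> attributes;

    DemoAnnotation(DemoAnnotationType type, Map<String, String> attributes) {
        this.type = type;
        this.attributes = attributes;
    }

    // Scans the script line by line and keeps only comment lines with a known keyword.
    static List<DemoAnnotation> parse(String script) {
        List<DemoAnnotation> result = new ArrayList<>();
        for (String line : script.split("\\R")) {
            String text = line.trim();
            if (!text.startsWith("#")) {
                continue;
            }
            text = text.substring(1).trim();
            for (DemoAnnotationType type : DemoAnnotationType.values()) {
                String start = type.keyword + ":";
                if (text.startsWith(start)) {
                    String body = text.substring(start.length());
                    result.add(new DemoAnnotation(type, parseAttributes(body)));
                    break;
                }
            }
        }
        return result;
    }

    // Splits "key = value, key = value;" into an ordered map; fragments without "=" are skipped.
    private static Map<String, String> parseAttributes(String body) {
        Map<String, String> attrs = new LinkedHashMap<>();
        String trimmed = body.trim();
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        for (String pair : trimmed.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue;
            }
            attrs.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return attrs;
    }

    public static void main(String[] args) {
        String script = "# wps.in: id = a, type = integer;\nresult <- a + 1\n";
        for (DemoAnnotation annotation : parse(script)) {
            System.out.println(annotation.type + " " + annotation.attributes);
        }
    }
}
```

A list of such objects is what a process description generator could iterate over to collect the inputs and outputs of a script.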
User Stories
The following user stories were used to define the main aspects and use cases of WPS4R. Afterwards, a first prioritized list of use cases was developed in the form of a roadmap (see below).
A) Model Web
The Model Web brings data, processes and result presentations to online services. Within the Model Web, WPS4R instances can be connected to different data sources, e.g. SOS or WMS servers, but also to each other. A concrete example is a climate researcher with a complex particle diffusion model who wants to deploy his R scripts in a cloud environment for use by other researchers within their model chains. Since his colleagues do not work with R, he decides to encapsulate his analyses within a web processing service using the WPS standard and the WPS4R backend. His workflow is publicly available via a simple interface that allows integration with WPS clients available in desktop and online GIS, such as ArcGIS, uDig or OpenLayers.
B) Optimizing Computing Resources
A user wants to leverage the powerful resources of a server instead of his own limited computing resources, such as a laptop that is needed for other tasks or a mobile phone or tablet. Multiple users work with one WPS installation at the same time, constantly uploading scripts and testing them. This minimizes the number of WPS server installations and makes the power of sophisticated server configurations available to many users.
Advantages: Processing power, less maintenance, integration of non-R algorithms (although direct R interfaces exist to ArcGIS, GRASS, terraLib) with one interface.
Users: GIS users, (R users? Could also do cloud computing in R)
C) R Scripts within a Familiar Environment
A user wants to use sophisticated R scripts in his familiar environment (e.g. ArcMap), which does not natively integrate R but provides a WPS client.
D) WebGIS and Process Sharing
A user familiar with R is working with a WebGIS, such as uDig or OpenLayers. He wants to enhance its capabilities, so he decides to use R scripts to write his own custom tools. He favours a WPS-R connector over local R connectors for several reasons: it gives him an opportunity to share his custom tools with other users, who might neither stick to R nor to a particular WebGIS; WPS allows different kinds of data types for the same input/output; and adding processes to a WPS-enabled WebGIS does not require further set-up. The process is not only shared for use within the WPS but also transparently available for download and local execution.
E) Web Publishing of Processing Outputs
A user wants to publish sensor data, such as daily temperatures or river gauge readings, through his website. Therefore he couples a SOS with a WPS. R processes allow him to automate evaluation and data analysis. With some effort, his website frequently updates plots, maps and textual output by using WPS4R.
F) Report Generation
[The following user story extends "Web Publishing of Processing Outputs".]
User A has one of the hardest jobs in research: generating daily, weekly, monthly and annual reports of data acquisition and results. Over the last years he has greatly automated the analysis process, but starting the analysis, compiling the report, and publishing the final results on the website still have to be done by hand. Luckily, the analysis is based on R and partly wrapped into R scripts. Thanks to the great report generation packages that R offers, such as knitr (based on Sweave), he can embed the analysis into a report document template and generate several different outputs, for example HTML and PDF, at the same time. He now ports this process to WPS4R, which allows him to have reports generated on demand on a powerful server. Since the server is in the same local network as the data storage servers, the report is even compiled a lot faster and without taking away local computing resources! The download website simply checks how old the available report is and, if it is outdated, triggers the report generation with a single Execute request to a WPS, which provides an updated link to which website visitors can be pointed.
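The freshness check in this story could be sketched as follows. It only decides whether a report is outdated and builds a key-value-pair Execute request URL in the style of WPS 1.0.0. It does not send the request or parse the response. The class ReportRefresher, the endpoint URL and the process identifier are hypothetical placeholders, and a real report process would most likely need additional request parameters for its inputs and outputs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for the report website, not part of WPS4R.
final class ReportRefresher {

    // A report is outdated when its age exceeds the allowed maximum.
    static boolean isOutdated(Instant generatedAt, Instant now, Duration maxAge) {
        return Duration.between(generatedAt, now).compareTo(maxAge) > 0;
    }

    // Builds a KVP Execute request URL for the given process identifier.
    static String executeUrl(String endpoint, String processId) {
        return endpoint
                + "?service=WPS&version=1.0.0&request=Execute&identifier="
                + encode(processId);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Instant generatedAt = Instant.now().minus(Duration.ofDays(2));
        if (isOutdated(generatedAt, Instant.now(), Duration.ofDays(1))) {
            // Placeholder endpoint and identifier, assumed for this example.
            System.out.println(executeUrl(
                    "http://localhost:8080/wps/WebProcessingService",
                    "org.n52.wps.server.r.report"));
        }
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside isOutdated, keeps the check easy to test.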
Demonstrators
Use Case Demo 1*: Publishing processing results (story E; develop proof of concept with simple graphic with real time data embedded in HTML) [DN]
Roadmap and Tasks
The roadmap and tasks are managed in a Scrum-style Backlog, see
WPS4RBacklog.
- Topic created by: DanielNuest
- Topic created on: 2013-04-14