52°North SOS Importer


The SOS Importer is a tool for importing observations from CSV files into a running SOS instance. The CSV files can be available locally or remotely (FTP is supported). The application uses the wizard design pattern to guide the user through several steps. These steps and their purposes are briefly characterized in the table below.

Wizard Module

The Wizard Module is the GUI wizard for creating the configurations (metadata about the CSV file).

Feeder Module

The Feeder Module uses the configurations created by the Wizard Module for importing the data into a running SOS instance.

Please check the how to run section for instructions to start the two modules.

Documentation

52n-sos-importer_screenshot_arranged-with-logo.png

License

The 52°North SOS Importer is published under the GNU General Public License v2.0. The licenses of the dependencies are documented in another section.

Requirements

The SOS Importer requires Java 1.7+ and a running SOS instance (OGC SOS v1.0 or v2.0) to work. The wizard module requires a GUI-capable system.

Terminology

Several sensor web specific terms are used within this topic:

  • Feature of Interest
  • Observed Property
  • Unit Of Measurement
  • Procedure or Sensor

If you are not familiar with them, please take a look at this explanatory OGC tutorial. It is short and easy to understand.

Step Description

Step | Description
1 | Choose a CSV file from the file system to publish in a SOS instance. Alternatively, you can obtain a CSV file from a remote FTP server.
2 | Provide a preview of the CSV file and select settings for parsing (e.g. which character is used for separating columns).
3 | Display the CSV file in tabular format and assign metadata to each column (e.g. indicate that the second column consists of measured values). Offer customizable settings for parsing (e.g. for date/time patterns).
4 | If more than one date/time, feature of interest, observed property, unit of measurement, sensor identifier or position has been identified in step 3, select the correct associations to the corresponding measured value columns (e.g. state that date/time in column 1 belongs to the measured values in column 3 and date/time in column 2 belongs to the measured values in column 4). When there is exactly one occurrence of a certain type, this type is automatically assigned to all measured values.
5 | Check available metadata for completeness and ask the user to add information in case something is missing (e.g. EPSG code for positions).
6 | When there is no metadata of a particular type present in the CSV file (e.g. sensor id), let the user provide this information (e.g. name and URI of this sensor).
7 | Enter the URL of the Sensor Observation Service to which the measurements and sensor metadata in the CSV file shall be uploaded.
8 | Summarize the results of the configuration process and provide means for importing the data into the specified SOS instance.

Design

The SOS Importer project now consists of three modules:
  • wizard
  • feeder
  • configuration xml bindings
The first two are "applications" that use the third to do their work: enabling users to store metadata about their CSV file and to import the contained data into a running SOS instance (OGC schema 1.0.0 or 2.0.0), once or repeatedly. In this process, the wizard module is used to create the XML configuration documents. It depends on the 52n-sos-importer-bindings module to write the configuration files (the XML schema: stable, development). The feeder module reads this configuration file and the defined data file, creates the requests for inserting the data, and registers the defined sensors in the SOS if this has not been done already. For communicating with the SOS, some modules of the OxFramework are used:
  • oxf-sos-adapter
  • oxf-common
  • oxf-feature
  • oxf-adapter-api
  • 52n-oxf-xmlbeans
Thanks to the new structure of the OxFramework the number of modules and dependencies is reduced. The following figure shows this structure.

sos-importer-structure.png

Wizard Module

The wizard module contains the following major packages:
  • org.n52.sos.importer
    • controller - contains all the business logic including a StepController for each step in the workflow and the MainController which controls the flow of the application.
    • model - contains all data holding classes, one for each step, and the overall XMLModel built using the 52n-sos-importer-bindings module.
    • view - contains all views (here: StepXPanel) including the MainFrame and the BackNextPanel. The sub packages contain special panels required for missing resources or re-used tabular views.
A good starting point for new developers is to take a look at MainController.setStepController(StepController), BackNextController.nextButtonClicked(), and the StepController. During each step, the XMLModel is updated via MainController.updateModel().

All important constants are stored in org.n52.sos.importer.Constants.

Feeder Module

The feeder module contains the following major packages:
  • org.n52.sos.importer.feeder - contains the classes Configuration and DataFile, which offer means of accessing the XML configuration and the CSV data file, the application's main class Feeder, which controls the application workflow, and the SensorObservationService class, which imports the data using the DataFile and Configuration classes for creating the required requests.
    • model - contains data holding classes for the resources like feature of interest, sensor, requests (insert observation, register/insert sensor)...
    • task - contains controllers required for one time and repeated feeding used by the central Feeder class.
A good starting point for new developers is to take a look at the Feeder.main(String[]) and follow the path through the code. When changing something regarding communication with the SOS and data parsing, take a look at: SensorObservationService, Configuration, and DataFile.

Configuration Schema

The configuration schema is used to store metadata about the data file and the import procedure. The following diagrams show the XML schema used within the wizard and feeder modules for storing and re-using the metadata that is required to perform the import process. In other words, the configuration XML files use this schema. These configuration files are required to create messages which the SOS understands.

SosImportConfiguration

01_SosImportConfiguration.png

Each SosImportConfiguration contains three mandatory sections. These are DataFile, SosMetadata, and CsvMetadata. The section AdditionalMetadata is optional.

02_DataFile.png

The DataFile contains information about the file containing the observations. The attributes are described in the table below. The second section of the DataFile is the choice between LocalFile or RemoteFile. A LocalFile has a Path and two optional parameters. The Encoding is used while reading the file (e.g. Java would use the system default, but the file is in another encoding such as UTF-8). The RegularExpresssionForAllowedFileNames is used to find files if Path is not a file but a directory (e.g. you might have a data folder with files from different sensors). A RemoteFile needs a URL and optional Credentials (Note: these are stored in plain text!). The last optional section of the DataFile is the IgnoreLineRegEx array. Here, you can define regular expressions to ignore lines which cause problems during the import process or contain data that should not be imported.
Attribute | Description | Optional
referenceIsARegularExpression | If set to TRUE, the LocalFile.Path or RemoteFile.URL contains a regular expression that needs special handling before the data file is retrieved. | no
useDateFromLastModifiedDate | If set to TRUE, the last modified date of the data file will be used as the date value for the observation of this data file. | yes
lastModifiedDelta | If available and useDateFromLastModifiedDate is set to TRUE, the date value will be set to n days before the last modified date. | yes
regExDateInfoInFileName | If present, the contained regular expression will be applied to the file name of the data file to extract date information in combination with the "dateInfoPattern". Hence, this pattern is used to extract the date string, and the dateInfoPattern is used to convert this date string into valid date information. The pattern MUST contain one group that holds all date information! | yes
dateInfoPattern | MUST be set if regExDateInfoInFileName is set, for converting the date string into valid date information. Supported pattern elements: y, M, d, H, m, s, z, Z | yes
headerLine | Identifies the header line. MUST be set if the header line occurs repeatedly in the data file. | yes
sampleStartRegEx | Identifies the beginning of a new sample in the data file. Requires the presence of the following attributes: sampleDateOffset, sampleDateExtractionRegEx, sampleDatePattern, sampleDataOffset. A "sample" is a single sampling run having additional metadata, like date information, which is not contained in the lines. | yes
sampleDateOffset | Defines the offset in lines between the first line of a sample and the line containing the date information. | yes
sampleDateExtractionRegEx | Regular expression to extract the date information from the line containing the date information of the current sample. The expression MUST result in ONE group. This group will be parsed to a java.util.Date using the sampleDatePattern attribute. | yes
sampleDatePattern | Defines the pattern used to parse the date information of the current sample. | yes
sampleDataOffset | Defines the offset in lines from the sample beginning to the first line with data. | yes
sampleSizeOffset | Defines the offset in lines from the sample beginning to the line containing the sample size in lines with data. | yes
sampleSizeRegEx | Defines a regular expression to extract the sample size. The regular expression MUST result in ONE group which contains an integer value. | yes
sampleSizeDivisor | Defines a divisor that is applied to the sample size. Can be used in cases where the sample size does not give the number of samples but the time span of the sample. The divisor is used to calculate the number of lines in a sample. | yes
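
The interplay of regExDateInfoInFileName and dateInfoPattern can be sketched as follows. This is a minimal illustration and not the importer's own code: the class FileNameDateSketch, the example file name, and the choice of UTC as time zone are assumptions made for this example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the SOS Importer code base.
class FileNameDateSketch {

    /**
     * Applies the regular expression to the whole file name, takes its single
     * capturing group as the date string, and parses that string with the
     * given pattern. Returns null if the expression does not match.
     */
    static Date extract(String fileName, String regExDateInfoInFileName,
            String dateInfoPattern) {
        Matcher matcher = Pattern.compile(regExDateInfoInFileName).matcher(fileName);
        // The expression MUST contain exactly one group holding the date information.
        if (!matcher.matches() || matcher.groupCount() != 1) {
            return null;
        }
        SimpleDateFormat format = new SimpleDateFormat(dateInfoPattern, Locale.ROOT);
        // Assumption: the date is interpreted as UTC; the importer may differ.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false);
        try {
            return format.parse(matcher.group(1));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Date string does not fit the pattern", e);
        }
    }

    public static void main(String[] args) {
        // Example file name and expression are made up for this sketch.
        Date date = extract("station-a_2015-04-29.csv",
                "^.*_(\\d{4}-\\d{2}-\\d{2})\\.csv$", "yyyy-MM-dd");
        System.out.println(date);
    }
}
```

The regular expression only isolates the date string; all interpretation of that string is left to the date pattern, which is why both attributes have to be set together.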

SosMetadata

03_SosMetadata.png

The SosMetadata section has one optional attribute, insertSweArrayObservationTimeoutBuffer, and three mandatory sections: URL, Offering (with the attribute generate), and Version. The section Binding is optional. The insertSweArrayObservationTimeoutBuffer is required if the import strategy SweArrayObservationWithSplitExtension (more details later) is used. It defines an additional timeout that is used when sending the InsertObservation requests to the SOS. The URL defines the service endpoint that receives the requests (e.g. InsertSensor or RegisterSensor, InsertObservation). The Offering should contain the offering identifier to use, or its attribute generate should be set to true. Then, the sensor identifier is used as the offering identifier. The Version section defines the OGC specification version that is understood by the SOS instance, e.g. 1.0.0 or 2.0.0. The optional Binding section is required when selecting SOS version 2.0.0 and defines which binding should be used, e.g. SOAP or POX.
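
Two of the rules above, the timeout buffer and the generated offering identifier, can be written down as a small sketch. The class SosMetadataSketch and its method names are hypothetical; the 5 s default timeout is taken from the 0.4 release notes in the road map and should be treated as an assumption about the OX-F SimpleHttpClient.

```java
// Hypothetical helper illustrating two SosMetadata rules; not importer code.
class SosMetadataSketch {

    // Default connect/socket timeout in milliseconds, as stated for the
    // OX-F SimpleHttpClient in the 0.4 release notes (assumption).
    static final int DEFAULT_TIMEOUT_MILLIS = 5000;

    /** Adds the configured buffer (in milliseconds) to the default timeout. */
    static int effectiveTimeoutMillis(int insertSweArrayObservationTimeoutBuffer) {
        return DEFAULT_TIMEOUT_MILLIS + insertSweArrayObservationTimeoutBuffer;
    }

    /** Uses the sensor identifier as offering identifier if generate is true. */
    static String offeringIdentifier(boolean generate, String offering,
            String sensorIdentifier) {
        return generate ? sensorIdentifier : offering;
    }

    public static void main(String[] args) {
        // A buffer of 25,000 ms on top of the 5 s default.
        System.out.println(effectiveTimeoutMillis(25000));
        // The sensor URI below is a made-up example value.
        System.out.println(offeringIdentifier(true, null, "http://example.org/sensor/1"));
    }
}
```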

CsvMetadata

04_CsvMetadata.png

The CsvMetadata contains information for the CSV parsing. The mandatory sections DecimalSeparator, Parameter/CommentIndicator, Parameter/ColumnSeparator and Parameter/TextIndicator define how to parse the raw data into columns and rows. The optional CsvParserClass is required if a CsvParser implementation other than the default is used (see Extend CsvParser for more details). The FirstLineWithData defines how many lines should be skipped before the data content starts. The most complex and important section is the ColumnAssignments section, which contains 1..∞ Column sections.
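
To illustrate what the four parsing parameters mean, here is a deliberately simplified sketch of a line splitter. It is not the parser used by the importer: the class CsvLineSketch is hypothetical, and details such as escaped text indicators inside quoted text are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified illustration of the CsvMetadata parameters.
class CsvLineSketch {
    private final char columnSeparator;
    private final char textIndicator;
    private final String commentIndicator;
    private final char decimalSeparator;

    CsvLineSketch(char columnSeparator, char textIndicator,
            String commentIndicator, char decimalSeparator) {
        this.columnSeparator = columnSeparator;
        this.textIndicator = textIndicator;
        this.commentIndicator = commentIndicator;
        this.decimalSeparator = decimalSeparator;
    }

    /** Returns the column values of a line, or null for a comment line. */
    String[] split(String line) {
        if (line.startsWith(commentIndicator)) {
            return null;
        }
        List<String> columns = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inText = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == textIndicator) {
                inText = !inText; // toggle quoted state and drop the indicator
            } else if (c == columnSeparator && !inText) {
                columns.add(current.toString()); // separator outside of text ends a column
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        columns.add(current.toString());
        return columns.toArray(new String[0]);
    }

    /** Parses a numeric value that uses the configured decimal separator. */
    double parseNumber(String value) {
        return Double.parseDouble(value.trim().replace(decimalSeparator, '.'));
    }

    public static void main(String[] args) {
        CsvLineSketch parser = new CsvLineSketch(';', '"', "#", ',');
        System.out.println(Arrays.toString(parser.split("01.01.2015 10:00;\"a;b\";3,5")));
        System.out.println(parser.parseNumber("3,5"));
    }
}
```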

05_Column.png

The Column contains two mandatory sections, Number and Type. The Number indicates which column in the data file this metadata relates to. Counting starts with 0. The Type indicates the column type. The following types are supported:

Type
Type | Content of the column
DO_NOT_EXPORT | Do not export this column. It should be ignored by the application. It's the default type.
MEASURED_VALUE | The result of the performed observation, in most cases some value.
DATE_TIME | The date or time or date and time of the performed observation.
POSITION | The position of the performed observation.
FOI | The feature of interest.
OBSERVED_PROPERTY | The observed property.
UOM | The unit of measure using UCUM codes.
SENSOR | The sensor id.

Some of these types require several Metadata elements, each consisting of a Key and a Value. The currently supported values of Key are:

Key
Key | Value
GROUP | Indicates the membership of this column in a POSITION or DATE_TIME group.
TIME | Not used.
TIME_DAY | The day value for the time stamp for all observations in the related MEASURED_VALUE column.
TIME_HOUR | The hour value for the time stamp for all observations in the related MEASURED_VALUE column.
TIME_MINUTE | The minute value for the time stamp for all observations in the related MEASURED_VALUE column.
TIME_MONTH | The month value for the time stamp for all observations in the related MEASURED_VALUE column.
TIME_SECOND | The second value for the time stamp for all observations in the related MEASURED_VALUE column.
TIME_YEAR | The year value for the time stamp for all observations in the related MEASURED_VALUE column.
TIME_ZONE | The time zone value for the time stamp for all observations in the related MEASURED_VALUE column.
TYPE | For a MEASURED_VALUE column, these values are supported: NUMERIC, COUNT, BOOLEAN, TEXT. For a DATE_TIME column, these values are supported: COMBINATION, UNIX_TIME.
OTHER | Not used.
PARSE_PATTERN | Used to store the parse pattern of a POSITION or DATE_TIME column.
POSITION_ALTITUDE | The altitude value for the positions for all observations in the related MEASURED_VALUE column.
POSITION_EPSG_CODE | The EPSG code for the positions for all observations in the related MEASURED_VALUE column.
POSITION_LATITUDE | The latitude value for the positions for all observations in the related MEASURED_VALUE column.
POSITION_LONGITUDE | The longitude value for the positions for all observations in the related MEASURED_VALUE column.

The RelatedDateTimeGroup is required by a MEASURED_VALUE column and identifies all columns that contain information about the time stamp for an observation. The RelatedMeasuredValueColumn identifies the MEASURED_VALUE column for columns of other types, e.g. DATE_TIME, SENSOR, FOI. The Related(FOI|ObservedProperty|Sensor|UnitOfMeasurement) sections contain either an IdRef or a Number. The Number denotes the Column that contains the value. The IdRef links to a Resource in the AdditionalMetadata section (Note: the value of IdRef is unique within the document and is used only for document-internal links).

AdditionalMetadata

06_AdditionalMetadata.png

The AdditionalMetadata is the last of the four top level sections and it is optional. The intention is to provide additional metadata. These are generic Metadata elements, Resources like Sensor, ObservedProperty, FeatureOfInterest, UnitOfMeasurement and FOIPositions. The table below lists the supported values for the Metadata elements.

Key | Value
IMPORT_STRATEGY | The import strategy to use: SingleObservation (default strategy) or SweArrayObservationWithSplitExtension. The second one only works if the SOS instance supports the SplitDataArrayIntoObservations request extension (see SensorObservationServiceIVDocumentation). It results in better performance and less data transferred.
HUNK_SIZE | Integer value defining the number of rows that should be combined in one SWEArrayObservation.
OTHER | Not used. May be used by other CsvParser implementations.
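
The effect of HUNK_SIZE can be pictured as splitting the data rows into groups of at most that many rows, each group ending up in one SWEArrayObservation. The following sketch shows only this grouping; the class HunkSketch is hypothetical and does not build any requests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper showing how rows could be grouped by HUNK_SIZE.
class HunkSketch {

    /** Splits the rows into consecutive hunks of at most hunkSize rows. */
    static <T> List<List<T>> split(List<T> rows, int hunkSize) {
        if (hunkSize < 1) {
            throw new IllegalArgumentException("hunkSize must be at least 1");
        }
        List<List<T>> hunks = new ArrayList<List<T>>();
        for (int from = 0; from < rows.size(); from += hunkSize) {
            int to = Math.min(from + hunkSize, rows.size());
            // Copy the view so each hunk is independent of the source list.
            hunks.add(new ArrayList<T>(rows.subList(from, to)));
        }
        return hunks;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row1", "row2", "row3", "row4", "row5");
        // The last hunk holds the remaining rows and may be smaller.
        System.out.println(split(rows, 2));
    }
}
```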

07_Resource.png

A Resource is a sensor, observed property, feature of interest or unit of measurement and it has a unique ID within each configuration. A resource can have a Position (e.g. a feature of interest). The information can be entered manually or it can be generated from values in the same line of the data file (GeneratedResource).

08_GeneratedResource.png

The Number elements define the Columns whose content is used for the identifier and name generation. The order of the Numbers is important. The optional ConcatString is used to combine the values from the different columns. The URI defines a URI for all Resources, or it is used as a prefix for a generated URI if useAsPrefix is set to true.
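
A minimal sketch of this generation rule is shown below. The class GeneratedResourceSketch, its method names, and the example values are hypothetical; how the importer treats a missing ConcatString is an assumption (here the values are simply joined without a separator).

```java
// Hypothetical illustration of the GeneratedResource rule; not importer code.
class GeneratedResourceSketch {

    /** Joins the values of the given columns (0-based) in the given order. */
    static String name(String[] line, int[] numbers, String concatString) {
        StringBuilder name = new StringBuilder();
        for (int i = 0; i < numbers.length; i++) {
            if (i > 0 && concatString != null) {
                name.append(concatString);
            }
            name.append(line[numbers[i]]);
        }
        return name.toString();
    }

    /** Returns URI + name if useAsPrefix is set, otherwise the URI itself. */
    static String uri(String configuredUri, boolean useAsPrefix, String name) {
        return useAsPrefix ? configuredUri + name : configuredUri;
    }

    public static void main(String[] args) {
        String[] line = {"29.04.2015 08:58", "Muenster", "station-1", "3.5"};
        // Column 2 comes before column 1 because the order of the Numbers matters.
        String name = name(line, new int[] {2, 1}, "_");
        System.out.println(name);
        System.out.println(uri("http://example.org/foi/", true, name));
    }
}
```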

09_ManualResource.png

A ManualResource has a Name, URI (when useAsPrefix is set, the URI := URI + Name).

Road map

Legend:
  • MOVED TO... → denotes future versions and not implemented features
  • DONE → denotes achieved versions and implemented features

Note: Dear developer, please update our Trello board accordingly!

MOVED TO... Open Features

Note: Please add feature requests to the feedback section or as a new GitHub issue with the label enhancement.

  • PICK Allow regular expressions to describe dynamic directory/file names (repeated feeding)
  • PICK Generic web client for multiple protocol support
  • PICK Pushing new data directly into a SOS database through a database connection (via according SQL statements)
  • PICK Feed to multiple SOS instances
  • PICK Support SOAP binding (might be an OX-F task)
  • PICK Support KVP binding (might be an OX-F task)

MOVED TO... 0.5

  • PICK Switch to joda-time or Java 8 DateTime API ⇒ switch to Java 8 because of EOL
  • PICK Handle failing InsertObservation requests, e.g. store in a common CSV format and re-import during the next run.
  • PICK Switch wizard to JavaFX.
  • DONE Fixed issues

DONE 0.4

  • Code: github
  • Features
    • Rename Core module to Wizard
    • Support for SOS 2.0 incl. Binding definition
    • Start Screen offers button to see all dependency licenses
    • Support for sensors with multiple outputs
    • Introduced import strategies:
      • SingleObservation: Default strategy
      • SweArrayObservationWithSplitExtension: Reads HUNK_SIZE lines and imports each time series using an SWEArrayObservation in combination with the SplitExtension of the 52North SOS implementation. Hence, this strategy works only in combination with the 52North implementation. Other implementations might work, too, but not as expected. Hunk size and import strategy are both optional <AdditionalMetadata><Metadata> elements.
    • Support for date information extraction from file name using two new OPTIONAL attributes in element <DataFile>:
      • "regExDateInfoInFileName" for extracting date information from file names.
      • "dateInfoPattern" for parsing the date information into a java.util.Date.
    • Date information extraction from last modified date using two new OPTIONAL attributes:
      • "useDateFromLastModifiedDate" for enabling this feature
      • "lastModifiedDelta" for moving the date n days back (this attribute is OPTIONAL for this feature, too.)
    • Ignore lines with regular expressions feature: 0..infinity <IgnoreLineRegEx> elements can be added to the <DataFile> element. Each element will be used as a regular expression and applied to each line of the data file before parsing.
    • Handling of data files containing several sample runs. A sample run contains additional metadata like its size (number of performed measurements) and a date. The required attributes are:
      • "sampleStartRegEx" - the start of a new sample (MUST match the whole line).
      • "sampleDateOffset" - the offset of the line containing the date of the sample from the start line.
      • "sampleDateExtractionRegEx" - the regular expression to extract the date information from the line containing the date information of the current sample. The expression MUST result in ONE group. This group will be parsed to a java.util.Date using "sampleDatePattern" attribute.
      • "sampleDatePattern" - the pattern used to parse the date information of the current sample.
      • "sampleDataOffset" - the offset in lines from sample beginning till the first lines with data.
      • "sampleSizeOffset" - the offset in lines from sample beginning till the line containing the sample size in lines with data.
      • "sampleSizeRegEx" - the regular expression to extract the sample size. The regular expression MUST result in ONE group which contains an integer value.
    • Setting of timeout buffer for the insertion of SweArrayObservations:
      With the attribute "insertSweArrayObservationTimeoutBuffer" of <SosMetadata>, it is possible to define an additional timeout buffer for the connect and socket timeout when using the import strategy "SweArrayObservationWithSplitExtension". The value is in milliseconds, e.g. 1000 => 1s more connect and socket timeout. The size of this value is related to the set-up of the SOS server, the importer, and the HUNK_SIZE value.
      The current OX-F SimpleHttpClient implementation uses a default value of 5s, hence setting this to 25,000 results in 30s connection and socket timeout.
    • More details can be found in the release notes.
  • Fixed Bugs/Issues
    • #06: Hardcoded time zone in test
    • #10: NPE during feeding if binding value is not set
    • #11: BadLocationException in the case of having empty lines in csv file
    • #20: Current GUI is broken when using sample based files with minor inconsistencies
    • #24: Fix/ignore line and column: Solved two NPEs while ignoring lines or columns
    • #25: Fix/timezone-bug-parse-timestamps: Solved bug while parsing time stamps
    • #NN: Fix bug with timestamps of sample files
    • #NN: Fix bug with incrementing lastline causing data loss
    • #NN: Fix bug with data files without headerline
    • #NN: NSAMParser: Fix bug with timestamp extraction
    • #NN: NSAMParser: Fix bug with skipLimit
    • #NN: NSAMParser: Fix bug with empty lines, line ending, time series encoding
    • #NN: fix/combinationpanel: On step 3 it was not possible to enter parse patterns for position and date & time
    • #NN: fix problem with textfield for CSV file when switching to German
    • #NN: fix problem with multiple sensors in CSV file and register sensor
    • 878
    • "Too many columns issue"
    • Fixed issues
  • Release files:
    • Feeder Module: Snapshots (download the newest file ending with -bin.jar).
    • Wizard Module: Snapshots (download the newest file ending with -bin.jar).

DONE 0.3

DONE 0.2

  • Release file: Core Module md5, Feeding Module md5
  • Code: Github
  • Features
    • DONE maven build
    • DONE multi language support
    • DONE xml configuration
    • DONE generation of FOIs and other data from columns
    • DONE feeding component

DONE 0.1

Contributors

Get Involved

You may first get in touch using the sensor web mailing list (Mailman page including archive access, Forum view for those who prefer a browser). In addition, you might follow the general instructions about getting involved with 52°North, which offers more ways to contribute than as a developer, e.g. as a designer, translator, or writer. Your help is always welcome!

Project Funding

  • by the European FP7 research project EO2HEAVEN (co-funded by the European Commission under the grant agreement n°244100).
  • by the European FP7 research project GeoViQua (co-funded by the European Commission under the grant agreement n°265178).
  • by University of Leicester during 2014.

Users

How to Run

Module feeder

  1. Have datafile and configuration file ready.
  2. Open command line tool.
  3. Change to directory with 52n-sos-importer-feeder-$VERSION_NUMBER$-bin.jar
  4. Run java -jar 52n-sos-importer-feeder-$VERSION_NUMBER$-bin.jar to see the latest supported and required parameters like this:
    usage: java -jar Feeder.jar -c file [-d datafile] [-p period]
                options and arguments:
                -c file    : read the config file and start the import process
                -d datafile : OPTIONAL override of the datafile defined in config file
                -p period   : OPTIONAL time period in minutes for repeated feeding

  • Notes
    • Repeated Feeding
      • Element SosImportConfiguration::DataFile::LocalFile::Path
        • ...set to a directory → the repeated feeding implementation will always take the newest file (regarding java.io.File.lastModified()) and skip the current run if no new file is available.
        • ...set to a file → the repeated feeding implementation will always try to import all found observations from the data file (Note: the 52°North SOS implementation prohibits inserting duplicate observations, so finding some in the data file is no problem).
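
The directory case described in the notes above, picking the newest file by java.io.File.lastModified(), can be sketched like this. The class NewestFileSketch is hypothetical, and combining the selection with a file name filter in the style of RegularExpresssionForAllowedFileNames is an assumption about how the two settings interact.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.regex.Pattern;

// Hypothetical sketch of "take the newest file of a directory".
class NewestFileSketch {

    /**
     * Returns the most recently modified regular file in the directory whose
     * name matches the expression (null = accept all), or null if none exists.
     */
    static File newest(File directory, String allowedFileNamesRegEx) {
        File[] candidates = directory.listFiles();
        if (candidates == null) {
            return null; // not a directory or not readable
        }
        Pattern allowed = allowedFileNamesRegEx == null
                ? null : Pattern.compile(allowedFileNamesRegEx);
        File newest = null;
        for (File candidate : candidates) {
            if (!candidate.isFile()) {
                continue;
            }
            if (allowed != null && !allowed.matcher(candidate.getName()).matches()) {
                continue;
            }
            if (newest == null || candidate.lastModified() > newest.lastModified()) {
                newest = candidate;
            }
        }
        return newest;
    }

    /** Creates three temporary files and returns the name of the selected one. */
    static String demo() {
        try {
            File dir = Files.createTempDirectory("feeder-sketch").toFile();
            dir.deleteOnExit();
            String[] names = {"a.csv", "b.csv", "c.txt"};
            long[] modified = {1400000000000L, 1400000060000L, 1400000120000L};
            for (int i = 0; i < names.length; i++) {
                File file = new File(dir, names[i]);
                file.createNewFile();
                file.setLastModified(modified[i]);
                file.deleteOnExit();
            }
            // c.txt is the newest file overall but is excluded by the expression.
            return newest(dir, ".*\\.csv").getName();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A feeder following the note above would additionally remember the file it imported last and skip the run if the selection has not changed; that bookkeeping is left out here.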

Module wizard

  1. Open command line tool.
  2. Change to directory with 52n-sos-importer-wizard-$VERSION_NUMBER$-bin.jar
  3. Run java -jar 52n-sos-importer-wizard-$VERSION_NUMBER$-bin.jar
  4. Follow the wizard to create a configuration file which can be used by the feeder module for repeated feeding, or import the data once using the wizard (the second option requires the latest 52n-sos-importer-feeder-$VERSION_NUMBER$-bin.jar in the same folder as the 52n-sos-importer-wizard-$VERSION_NUMBER$-bin.jar).

Example tutorial

Follow this list of steps or this user guide using the demo data to get a first user experience.

  1. Download the example data to your computer.
  2. Start the application with java -jar 52n-sos-importer-wizard-bin.jar
  3. Select the file you have just downloaded on step 1 and click next.
  4. Increase the value of Ignore data until line to 1 and click next.
  5. Select Date & Time and then Combination, provide the following date parsing pattern: dd.MM.yyyy HH:mm, and click next.
  6. Three times: select Measured Value and then Numeric Value, and click next.
  7. Set UTC offset to 0. Note: if you want to import data regularly, it is common sense to use UTC for timestamps.
  8. Feature of Interest: On the next view, Add missing metadata, select Set Identifier manually, click on the pen next to the Name label, and enter any name and URI combination you can think of. Repeat this step 3 times (once for each time series). For time series #2 and #3 you can also select the previously entered value.
  9. Observed property: repeat as before, but enter the name and URI of the observed property for each time series, e.g. Propane, Water and Krypton.
  10. Unit of Measurement: repeat as before, but enter the name and URI of the unit of measurement for each time series, e.g. l, l, kg.
  11. Sensor: repeat as before, but enter the name and URI of the sensor that performed the observations for each time series, e.g. propane-sensor, water-meter, crypro-graph.
  12. Define the position of the feature of interest manually, giving its coordinates.
  13. Next, specify the URL of the SOS instance you want to import data into.
  14. Choose a folder to store the import configuration (for later re-use with the feeder module, for example).
  15. Specify the OGC specification version the SOS instance supports (we recommend using 2.0.0).
  16. When using the 52N SOS, you can specify to use the import strategy SweArrayObservation which improves the performance of the communication between feeder and SOS a lot.
  17. On the last step, you can view the log file, configuration file or start the import procedure. That's it, now you should be able to import data into a running SOS instance from CSV files, or similar.
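
The date parsing pattern from step 5 and the UTC offset from step 7 can be tried out in isolation before running the wizard. The sketch below is a hypothetical check: the class UtcOffsetSketch and the sample values are assumptions, and the wizard's own handling of offsets may differ in detail.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical check of a parse pattern together with a UTC offset.
class UtcOffsetSketch {

    /** Parses the value with the pattern, interpreting it at the given offset. */
    static Date parse(String value, String parsePattern, int utcOffsetHours) {
        SimpleDateFormat format = new SimpleDateFormat(parsePattern, Locale.ROOT);
        // Builds a custom time zone id such as GMT+02:00 or GMT-05:00.
        String zoneId = String.format(Locale.ROOT, "GMT%+03d:00", utcOffsetHours);
        format.setTimeZone(TimeZone.getTimeZone(zoneId));
        format.setLenient(false);
        try {
            return format.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Value does not fit the pattern", e);
        }
    }

    /** Formats the date as an ISO 8601 time stamp in UTC. */
    static String toIso(Date date) {
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        iso.setTimeZone(TimeZone.getTimeZone("UTC"));
        return iso.format(date);
    }

    public static void main(String[] args) {
        // Same local time stamp, interpreted with two different offsets.
        System.out.println(toIso(parse("29.04.2015 08:58", "dd.MM.yyyy HH:mm", 0)));
        System.out.println(toIso(parse("29.04.2015 08:58", "dd.MM.yyyy HH:mm", 2)));
    }
}
```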

Demo data

You can just download example files to see how the application works:

Developers

Todos

MOVED TO... Github issues are used to organize the work.

How to Build

  1. Have jdk, maven, and git installed already.
  2. Due to some updates to the OX-Framework done during the SOS-Importer development, you might need to build the OX-F from the branch develop. Please check in the pom.xml the value of <oxf.version>. If it ends with -SNAPSHOT, continue here, else continue with step #3:
    ~$ git clone https://github.com/52North/OX-Framework.git
    ~$ cd OX-Framework
    ~/OX-Framework$ mvn install
    or this fork (please check for open pull requests!):
    ~/OX-Framework$ git remote add eike https://github.com/EHJ-52n/OX-Framework.git
    ~/OX-Framework$ git fetch eike
    ~/OX-Framework$ git checkout -b eike-develop eike/develop
    ~/OX-Framework$ mvn clean install
  3. Check out the latest version of the SOS-Importer with:
    ~$ git clone https://github.com/52North/sos-importer.git
    in a separate directory.
  4. Switch to the required branch (master for the latest stable version; develop for the latest development version) via ~/sos-importer$ git checkout develop, for example.
  5. Set up the geotools repository like this (maven help regarding repositories):
    <repository>
       <id>osgeo</id>
       <name>Open Source Geospatial Foundation Repository</name>
       <url>http://download.osgeo.org/webdav/geotools/</url>
    </repository>
  6. Build SOS importer modules:
    ~/sos-importer$ mvn install
  7. Find the jar files here:
    • wizard: ~/sos-importer/wizard/target/
    • feeder: ~/sos-importer/feeder/target/

URLs | Comment
git: https://github.com/52North/sos-importer/tree/develop | latest development version
git: https://github.com/52North/sos-importer/tree/master | latest stable version
git: https://github.com/52North/sos-importer/releases/tag/52n-sos-importer-0.4.0 | release 0.4.0
git: https://github.com/52North/sos-importer/releases/tag/52n-sos-importer-0.3.0 | release 0.3.0
svn: https://svn.52north.org/svn/swe/incubation/SOS-importer/tags/52n-sos-importer-0.2.0/ | release 0.2.0
git: https://github.com/52North/sos-importer/releases/tag/52n-sos-importer-0.2.0 | release 0.2.0
svn: https://svn.52north.org/svn/swe/incubation/SOS-importer/tags/52n-sos-importer-0.1.0/ | release 0.1.0
git: https://github.com/52North/sos-importer/releases/tag/52n-sos-importer-0.1.0 | release 0.1.0

Code Repository

Dependencies

Extend CsvParser

For providing your own CsvParser implementation

Since version 0.4.0, it is possible to implement your own CsvParser type if the current generic CSV parser implementation is not sufficient for your use case. Currently, one additional parser is implemented: the NSAMParser is able to handle CSV files that grow not from top to bottom but from left to right.

To get your own parser implementation working, you need to implement the CsvParser interface (see the next Figure for more details).

sos-importer_csvparser.png

In addition, you need to add <CsvParserClass> to <CsvMetadata> in your configuration. It names the class that MUST be used for parsing the data file. The interface org.n52.sos.importer.feeder.CsvParser MUST be implemented. The class name MUST contain the fully qualified package name, and a zero-argument constructor MUST be provided.

The CsvParser.init(..) is called after the constructor and should result in a ready-to-use parser instance. CsvParser.readNext() returns the next "line" of values that should be processed as String[]. An IOException could be thrown if something unexpected happens during the read operation. The CsvParser.getSkipLimit() should return 0, if number of lines == number of observations, or the difference between line number and line index.
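
The shape of such an implementation can be sketched as follows. Note the assumptions: the interface CsvParserSketch below is a local stand-in for org.n52.sos.importer.feeder.CsvParser, the parameter list of init(..) is invented because the text does not state it, and returning null at the end of the input is an assumed convention. Check the real interface before adapting this.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Local stand-in for the real CsvParser interface; signatures are assumptions.
interface CsvParserSketch {
    void init(BufferedReader reader) throws IOException;
    String[] readNext() throws IOException;
    int getSkipLimit();
}

// Hypothetical parser for lines whose values are separated by a pipe character.
class PipeSeparatedParser implements CsvParserSketch {
    private BufferedReader reader;

    // A zero-argument constructor MUST be provided.
    public PipeSeparatedParser() {
    }

    // After init(..) the parser must be ready to use.
    public void init(BufferedReader reader) {
        this.reader = reader;
    }

    // Returns the next line as values, or null at the end (assumed convention).
    public String[] readNext() throws IOException {
        String line = reader.readLine();
        return line == null ? null : line.split("\\|", -1);
    }

    // Number of lines equals number of observations here, hence 0.
    public int getSkipLimit() {
        return 0;
    }

    /** Reads all lines of the given content with a fresh parser instance. */
    static List<String[]> parseAll(String content) {
        PipeSeparatedParser parser = new PipeSeparatedParser();
        List<String[]> lines = new ArrayList<String[]>();
        try {
            parser.init(new BufferedReader(new StringReader(content)));
            String[] values;
            while ((values = parser.readNext()) != null) {
                lines.add(values);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String[] values : parseAll("time|value\n2015-04-29|3.5")) {
            System.out.println(String.join(" , ", values));
        }
    }
}
```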

Troubleshooting/Bugs

If you have any problems, please check the issues section for the sos importer first:

Feedback

  • Add user interface for file output (generates request as separate XML files instead of sending them) for testing and inspection before inserting data. This requires the OX-F to be extended with functionality to create and retrieve request documents. -- DanielNuest - 2013-01-07
  • Step 2:
    • The highlighting for the selected lines is a bit misleading. I think it is better to highlight the lines that will be used, so if I say "Ignore data until line 3", then the first two lines should be ignored and all others marked as selected. -- DanielNuest - 2012-11-06
    • I think a "first line contain column names" would be a good option > it has the same effect than ignoring everything until line 2, but might be more easily understandable by users. And clicking that checkbox can effectively just set the value of the "ignore lines" input field. -- DanielNuest - 2012-11-06
  • Step 3:
    • Ignored lines should not be shown in Step 3 -- DanielNuest - 2012-11-06
  • Why can I not resize the window? -- DanielNuest - 2012-11-06
Topic revision: r64 - 14 Jun 2016 08:56:33, EikeJuerrens - This page was cached on 20 Jul 2016 - 11:42.

Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.