Trajectory Analysis in R

Student: Jinlong Yang

Mentors: Edzer Pebesma (pebesma@52north.org), Daniel Nüst (d.nuest@52north.org)

Project Description

In this project, I will develop classes and methods specifically for trajectory analysis in the R language. The methods will build on classes in the spacetime package and will include computation of trajectory attributes, trajectory selection, trajectory aggregation (by time/space/id), sampling, visualization, and interpolation. I will also explore other common methods for trajectory analysis. In addition, I intend to develop a method that converts generic trajectory data in various formats into STTDF objects from the spacetime package.

The project is managed at R-Forge.org, where the code can be accessed. If you want to test the package in your own R installation, you can get anonymous Subversion access with: svn checkout svn://r-forge.r-project.org/svnroot/spacetime/

Weekly Report

Week 1 - June 24, 2013

Status:

- First (unsuccessful) take on developing spatial and temporal selection operations. These operations will be revised later to make use of classes and methods that are currently available in the spacetime and sp packages;
- Set up Subversion version control in RStudio and started to commit;
(I started using a Mac this week. Subversion works on my Mac, but not on my PC. I will figure out how to set up Subversion on the PC next week);
- Studied the overlay and aggregation vignettes from the sp and spacetime packages (see links below). Learned how the over() and aggregate() methods implement overlay/aggregation in space, time, and space-time;
http://cran.r-project.org/web/packages/sp/vignettes/over.pdf
http://cran.r-project.org/web/packages/spacetime/vignettes/sto.pdf
- Studied the sections "SoftwareSystems", "The S3 object system", and "The S4 object system" from Advanced R development: making reusable code (link: https://github.com/hadley/devtools/wiki). Learned the major differences between S3 and S4 objects (formal class definitions & multiple dispatch) and how to create/modify S4 objects;
- Learned how do.call and lapply work from spacetime/R/STTDF-methods.R;
- Learned how STTDF objects are coerced from ltraj objects from spacetime/R/coerce.R.
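
The two S4 features noted above (formal class definitions and multiple dispatch) can be illustrated with a minimal sketch. The class Track and the generic startDistance are hypothetical names made up for this example; they are not part of spacetime or sp.

```r
library(methods)  # S4 machinery (attached by default in interactive sessions)

# Hypothetical class, for illustration only: a formal definition with typed
# slots and a validity check that runs whenever new() is called.
setClass("Track",
         slots = c(x = "numeric", y = "numeric", time = "POSIXct"),
         validity = function(object) {
           n <- length(object@x)
           if (length(object@y) != n || length(object@time) != n)
             return("x, y and time must have the same length")
           TRUE
         })

# Hypothetical generic: the method is chosen from the classes of BOTH
# arguments (multiple dispatch), which S3 cannot do.
setGeneric("startDistance", function(a, b) standardGeneric("startDistance"))

# Distance between the first points of two tracks
setMethod("startDistance", signature("Track", "Track"), function(a, b) {
  sqrt((a@x[1] - b@x[1])^2 + (a@y[1] - b@y[1])^2)
})

# Distance between the first point of a track and a plain c(x, y) vector
setMethod("startDistance", signature("Track", "numeric"), function(a, b) {
  sqrt((a@x[1] - b[1])^2 + (a@y[1] - b[2])^2)
})

t0  <- as.POSIXct("2013-06-24 12:00:00", tz = "UTC")
tr1 <- new("Track", x = c(0, 1), y = c(0, 1), time = t0 + c(0, 60))
tr2 <- new("Track", x = c(3, 5), y = c(4, 6), time = t0 + c(0, 60))

d_track_track <- startDistance(tr1, tr2)      # uses the Track/Track method
d_track_point <- startDistance(tr1, c(6, 8))  # uses the Track/numeric method
```

Creating an object with slots of unequal length, e.g. new("Track", x = 1, y = c(1, 2), time = t0), fails the validity check and raises an error.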

Problems:
- Need to become more familiar with concepts and terms in R package development;
- Need to understand ltraj objects.

Tasks for next week:

- Study read.R in the demo directory of trajectories;
- First take on selection operations based on "[";
- Write methods that compute speed, turning angle, line length based on read.R.

Week 2 - July 1, 2013

Status:

- Studied read.R from the spacetime package;
- Wrote sttdf_computation.R (spacetime/pkg/trajectories/demo/):
1) calculate displacement using LineLength() from the sp package;
2) calculate time passed between consecutive points;
3) calculate speed (unit: m/s);
4) calculate relative angle change between consecutive points;
5) calculate absolute angle change between consecutive points;
6) (naive) transportation mode detection;
7) calculation of trajectory properties (e.g., elevation change)?
- First take on STTDF_selection.R based on "[" (spacetime/pkg/trajectories/demo/).
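
The per-step computations listed above (displacement, time passed, speed, absolute and relative angle) can be sketched in base R as follows. This is not the code in sttdf_computation.R: it swaps in a haversine distance and an initial-bearing formula as stand-ins for LineLength() and the azimuth function, it assumes longitude/latitude input in degrees, and the function names are hypothetical. Steps with a zero time interval get NA speed, which is only one of several possible ways to handle them.

```r
# Great-circle distance in metres between lon/lat points (haversine formula),
# assuming a spherical Earth of radius r.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: returns one row per step between consecutive points.
step_attributes <- function(lon, lat, time) {
  n <- length(lon)
  stopifnot(length(lat) == n, length(time) == n, n >= 2)
  i <- seq_len(n - 1)
  to_rad <- pi / 180

  dist <- haversine_m(lon[i], lat[i], lon[i + 1], lat[i + 1])
  timeLapsed <- as.numeric(difftime(time[i + 1], time[i], units = "secs"))

  # Speed in m/s; NA where no time has passed between two records
  speed <- ifelse(timeLapsed > 0, dist / timeLapsed, NA_real_)

  # Absolute angle: initial bearing in degrees clockwise from north, [0, 360)
  dlon <- (lon[i + 1] - lon[i]) * to_rad
  y <- sin(dlon) * cos(lat[i + 1] * to_rad)
  x <- cos(lat[i] * to_rad) * sin(lat[i + 1] * to_rad) -
    sin(lat[i] * to_rad) * cos(lat[i + 1] * to_rad) * cos(dlon)
  bearing <- (atan2(y, x) / to_rad) %% 360
  bearing[dist == 0] <- NA_real_  # direction is undefined for a zero-length step

  # Relative angle: change of bearing between steps, wrapped to [-180, 180)
  turn <- c(NA_real_, ((diff(bearing) + 180) %% 360) - 180)

  data.frame(dist, timeLapsed, speed, bearing, turn)
}

# Made-up points: one step east, one step north, then a duplicate record
t0 <- as.POSIXct("2013-07-01 09:00:00", tz = "UTC")
steps <- step_attributes(lon  = c(0, 0.001, 0.001, 0.001),
                         lat  = c(0, 0,     0.001, 0.001),
                         time = t0 + c(0, 10, 20, 20))
```

In this example the second step turns left by 90 degrees relative to the first, and the duplicate record yields NA for speed, bearing, and turning angle.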

Problems:

- How to handle a GPS point record if the time interval between two consecutive points is 0? Delete that point, or keep it but set dist, timeLapsed, speed, etc. to NA?
- How to make connections between points (i.e., rows) from sttdf@data and STIs from sttdf@traj?
- Where to store the attributes of a burst (e.g., total length, total time, average speed)?
- Whether to make use of the SpatialLines class and how (see examples in STTDF_selection.R).

Tasks for next week:

- Put STTDF_computation.R into a function STItoSTTDF;
- Start to write documentation;
- Write an over() function for the trajectories package:
select all trajs in an area (SpatialLines objects);
select all trajs between 2008 and 2009;
- Verification of the data (space, time);
- Selection functions that filter data based on, for example, a certain time interval.

Week 3 - July 8, 2013

Status:

- Put STTDF_computation.R into a method at trajectories/demo/STItoSTTDF.R;
- Modified read.R as trajectories/demo/geolife_reader.R with id and trip added (if id or trip is missing, a dummy id or trip will be added);
- Calculated the elevation change between consecutive GPS points in trajectories/demo/STItoSTTDF.R (potentially applicable to other properties of trajectory data);
- Calculated the absolute angles between consecutive GPS points using gzAzimuth() from the maptools package (see STItoSTTDF.R);
- Started writing documentation for the methods developed.

Problems:

- What is STbox?
- How to avoid using for loops in R (e.g., line 72 in STItoSTTDF.R)?
- What to expect about the input format of trajectory data? (I currently work with the GeoLife dataset, but in general should we assume the data comes as data frame(s) with each column being a property?).

Tasks for next week:

- Learn how to build/install/check package;

- Move all methods to R directory (instead of Demo directory);

- Figure out why the keyword sections are missing;
- Export STItoSTTDF to trajectories/NAMESPACE;
- Move the test section to demo directory;
- Put all testing in the example code in the R documentation file;
- Create a toy data set for example section of Rd file.

Week 4 - July 15, 2013

Status:

- Learned how to build/install/check package;

- Moved all methods to R directory;

- Exported STItoSTTDF to trajectories/NAMESPACE;
- Moved the test section to demo directory;
- Put all testing in the example code in the R documentation file;
- Created a toy data set for example section of Rd file;

- Made the calculation of elevation change faster by replacing for loops with diff();

- Added ID and trip as attributes of STI objects in the traj slot of STTDF object.
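
As a small illustration of the loop-to-diff() change mentioned above, the sketch below computes the elevation change between consecutive points both ways. The elevation values are made up, and padding the first point with NA is an assumption of this example.

```r
elev <- c(120, 125, 123, 123, 130)  # made-up elevations in metres

# Loop version: one subtraction per iteration
change_loop <- numeric(length(elev))
change_loop[1] <- NA                # no previous point for the first record
for (i in 2:length(elev)) {
  change_loop[i] <- elev[i] - elev[i - 1]
}

# Vectorized version: diff() returns elev[i] - elev[i - 1] for all i at once
change_vec <- c(NA, diff(elev))
```

Both vectors hold the same values; the diff() version avoids the interpreted loop and is noticeably faster on long trajectories.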

Problems:

- Need to find a sample dataset with a proper license for demonstration;

- How to develop over() for STTDF with STTDF;

- How to develop over() for STTDF with Spatial.

Tasks for next week:

- Start with a simpler scenario (a traj with a polygon);
- Coerce STI(STTDF) into SpatialLines to make use of over() for SpatialLines;
- over.sttdf.Spatial(Polygon, Line, Grid, Point) (without time);
- Two selection levels:
spatial selection (on traj level) (e.g., an STTDF and a polygon);
cutting trajs (on point level) (e.g., cut out the parts of trajs that overlap with a polygon).

Week 5 - July 22, 2013

Status:

- Implemented the spatial overlay between STTDF and SpatialPolygons;
- Wrote a function that coerces STTDF objects to SpatialLines objects;
- Enabled spatial overlay between STTDF and objects of the Spatial classes from the sp package;
- Wrote a function that cuts trajs (on point level) based on their overlay with a SpatialPolygons object.

Problems:

- When coercing STTDF objects into SpatialLines objects, the attribute data in the data slot of the STTDF objects is lost;
- When a trajectory is cut into more than one piece by the overlay operation with a polygon, the over() function won't automatically detect the break points that divide the trajectory into multiple segments;
- Potential bug in the trajectory cutting function.

Tasks for next week:

- Store the attribute data of STTDF objects when coercing them into SpatialLines;
- Reverse the coercion of STTDF objects to SpatialLines objects;
- Test for the bug in the trajectory cutting function.

Week 6 - July 29, 2013

Status:

- Implemented the plot function for STI and STTDF objects;
- Stored the attribute data of STTDF objects in the data slot of STTDF; the attribute(s) of each individual point can be linked to its spatial and temporal properties by its index;
- Resolved some of the bugs in the trajectory cutting function.

Problems:

- Need to remove the markers for the bounding box when plotting STTDF objects.

Tasks for next week:

- Implement the summary function for STI and STTDF objects;
- Implement the selection function for STI and STTDF objects.

Week 7 - Aug 5, 2013

Status:

- Implemented the summary function for STI and STTDF objects;
- Added a test directory and moved all the demos into that directory;
- House-cleaning for the working directory.

Problems:

- Need to rename cut.STTDF.SpatialPolygons.R to cut.R and modify it;
- Need an extra container to store attribute data when doing the coercion: STTDF -> SpatialLines -> STTDF.

Tasks for next week:

- Implement the selection function for STTDF;
- Implement temporal sampling for STTDF;
- Turn traj_stats into aggregate.STTDF().

Week 8 - Aug 12, 2013

Status:

- Improved the summary() function such that it displays NA for average elevation when the elevation data is unavailable;
- Replaced the "+" with "." as the drawing device placeholder in the plot() function;
- Developed the crop() function for STTDF (overlay with a SpatialPolygons object); a single STTDF object is returned as the result (on the individual-point level);
- Fixed a bug in the STItoSpatialLines() function.

Problems:

- The STTDF object returned by the crop() function needs to be split into multiple STTDF objects, each of them being a continuous trajectory;
- Need to remove the placeholder for the drawing device in the plot() function.

Tasks for next week:

- Implement the select function for STTDF;
- Turn traj_stats into aggregate.STTDF().

Week 9 - Aug 19, 2013

Status:

- Improved the crop() function such that, for an STTDF object, trip(s) made discontinuous by cropping with a polygon are divided into several individual sub-trips, each of which is continuous;
- Implemented the crop_index() function. This function subsets an STTDF object by index;
- Reimplemented crop_demo to demonstrate the newer version of the crop() function.
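
The idea of dividing a cropped trip into continuous sub-trips can be sketched in base R, assuming the overlay step has already produced a logical vector that marks which points fall inside the polygon. The function split_runs and its arguments are hypothetical; this is not the crop() implementation. The sketch also starts a new sub-trip wherever the original trip id changes, which is an added assumption.

```r
# inside: logical, TRUE where a point lies inside the polygon
# trip:   original trip id of each point, in the same order
# Returns an integer sub-trip id per point (NA for points outside the polygon).
split_runs <- function(inside, trip) {
  n <- length(inside)
  stopifnot(length(trip) == n)
  if (n == 0) return(integer(0))

  # A new run starts at the first point and wherever the inside flag
  # or the original trip id differs from the previous point
  change <- c(TRUE, inside[-1] != inside[-n] | trip[-1] != trip[-n])
  run <- cumsum(change)

  # Renumber only the runs that are inside the polygon as 1, 2, 3, ...
  sub_trip <- rep(NA_integer_, n)
  sub_trip[inside] <- match(run[inside], unique(run[inside]))
  sub_trip
}

# Made-up example: trip 1 enters the polygon twice, trip 2 once
inside   <- c(FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE)
trip     <- c(1, 1, 1, 1, 1, 2, 2, 2)
sub_trip <- split_runs(inside, trip)
```

Here points 2-3 form the first sub-trip, point 5 the second, and points 6-7 the third, because point 6 belongs to a different original trip even though it directly follows point 5 inside the polygon.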

Problems:

- When using the crop() function to subset an STTDF object, the original break points of the STI objects are missing;

- Can't find a solution to remove the placeholder for the plot() function.

Tasks for next week:

- Implement the select() function for STTDF;
- Fix the crop() function by adding the original break points;
- Implement the aggregate() function for STTDF.

Week 10 - Aug 26, 2013

Status:

- First take on the aggregate() function that aggregates the data from an STTDF object on an hourly level;
- Revised the crop() function to include the original break points (not working yet);
- First take on the merge function to aggregate an STTDF object based on a column.

Problems:

- In the crop() function, when inserting the original break points, the index used to distinguish different trips gets messed up;
- Having trouble converting the time stamps from STTDF objects into POSIXct objects.
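
For reference, a minimal sketch of the two most common conversions to POSIXct in base R. It assumes the time stamps are available either as character strings or as numeric seconds since 1970-01-01; the actual form of the time stamps extracted from an STTDF object may differ, and the values below are made up.

```r
# Case 1: time stamps as character strings.
# Giving format and tz explicitly avoids locale- and time-zone-dependent parsing.
chr   <- c("2008-10-23 02:53:04", "2008-10-23 02:53:10")
t_chr <- as.POSIXct(chr, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Case 2: time stamps as numeric seconds since the Unix epoch.
# This form often appears after a POSIXct vector has lost its class,
# for example through c(), unlist(), or sapply().
secs  <- c(1224730384, 1224730390)
t_num <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Time differences are safest when the unit is stated explicitly
gap_secs <- as.numeric(difftime(t_chr[2], t_chr[1], units = "secs"))
```

If the numbers look like Unix time (around 1.2e9 for dates in 2008), the second form is usually the one needed.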

Tasks for next week:

- Completely rewrite the crop() function to include the original break points;
- Figure out how to convert the time stamps collected from STTDF objects into POSIXct objects;
- Test the package with the testthat package;
- Add more info to the project Wiki page.

Week 11 - Sept. 2, 2013

Status:

- Implemented aggregate(), which allows aggregating an STTDF object by hour;
- Completely rewrote the crop() function such that it goes through each STI object in the STTDF object and crops the trajectories, so the original break points are preserved;
- First take on merge() and sample() functions (currently not working);
- Modified the demos for plot(), summary(), crop(), and aggregate() to accommodate the changes made this week. A trajectory map generated by the plot() function is attached as traj_plot_demo.png. In the map, each continuous trip (i.e., an STI object from the STTDF object) is visualized by a distinct color.
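
Hourly aggregation of point attributes can be sketched with base R on a plain data frame. This is not the aggregate() method for STTDF; the data frame pts, its columns, and the values are made up for illustration.

```r
t0  <- as.POSIXct("2013-09-02 08:50:00", tz = "UTC")
pts <- data.frame(
  time  = t0 + c(0, 300, 600, 900, 4200),  # 08:50, 08:55, 09:00, 09:05, 10:00
  speed = c(1, 3, 5, 7, 9)                 # made-up speeds in m/s
)

# Label every point with the hour it falls in
pts$hour <- format(pts$time, "%Y-%m-%d %H:00", tz = "UTC")

# Mean speed per hour; the result has one row per distinct hour label
hourly <- aggregate(speed ~ hour, data = pts, FUN = mean)
```

Changing the format string changes the time unit: "%Y-%m-%d" would aggregate by day and "%Y-%m" by month.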

Problems:

- Need to find another way to link the data in the data slot of the STTDF object to the spatial points and time stamps of the STTDF object. Currently, the linkage is made by indexing and counting;
- The plot function is not loaded properly when loading the trajectories package. plot.R must be run manually to make the plot() function work for STTDF objects.

Tasks for next week:

- Expand the aggregate() function to support data aggregation on different time units;
- Write a demo based on the EnviroCar dataset;
- Keep working on the sample() and merge() functions.

Week 12 - Sept. 9, 2013

Status:

- Updated the content of the /tests folder such that all the demo scripts are tested when checking and building the package;
- Moved short demos to the example section of the Rd files;
- Made a new sample dataset from the GeoLife dataset as part of the package;

- Modified read.R to help users test the package with a larger portion of the GeoLife dataset.

Problems:

- None.

Tasks for next week:

- Create vignette-style demos with figures for the functions developed so far;
- Keep working on the incomplete functions.

Week 13 - Sept. 16, 2013

Status:

- Fixed a bug in the crop() function that caused an error when only a single attribute is available;
- Adapted all testing scripts into the tests folder with expected outcomes generated;
- Created figures for vignettes (link at r-forge.org);

- Revised the geolife_reader() function such that users can test the package with part or all of the GeoLife data.

Problems:

- New bug found in the crop() function: trajectories go missing when cropping.

Tasks for next week:

- Write vignette and wrap-up;
- Fix the bug in the crop() function.

Week 14 - Sept. 23, 2013

Status:

- Fixed the bug in the crop() function that caused only trajectories falling entirely within the polygon to be selected;
- Cleaned up directories and warning messages;
- Vignette updated;
- Made the final post;
- Namespace updated.

Problems:

- None.

Tasks for future:

- Work on the selection() function;
- A plot() function that visualizes the attributes of the trajectories;
- etc.

About

My name is Jinlong Yang. I am a second-year master's student at Penn State Geography. My research interests center on Spatial Cognition and GeoVisualization.

Original Project Idea

Explanation: The current handling of trajectory data in R is scattered, partial, and strongly linked to particular domains (mainly ecology); generic data structures and methods are lacking (see the SpatioTemporal task view on CRAN). This GSoC project will implement and/or improve generic classes, methods and tools for handling and analyzing large-scale trajectory data with R.

Expected results: Classes and methods for handling trajectories in R, building on the classes for handling spatio-temporal data in R in package spacetime. Low-level manipulation will involve selection (by individual, by trip), aggregation (to a lower spatial and/or temporal resolution), sampling, simple (linear) interpolation and visualisation (by animation and by conditioning plots). Analysis methods will include computation of attributes (speed, direction); computing distances between trajectories; simple statistical methods for interpolation such as biased random walk. Code should be integrated in spacetime or depend on it; package should be ready for submission to CRAN.

Community and Code License: Geostatistics and Geoprocessing, GPL 2.0

Topic attachments

traj_plot_demo.png (PNG, 61.7 K, 2013-09-03): traj_plot_demo
Topic revision: r17 - 2013-09-24 - 05:55:07 - JinlongYang

 