By Itamar Haber, InfoWorld

Build geospatial apps with Redis

Learn how to simplify the development of location-based apps with Redis’ new geospatial indexing, sets, and operations

For an increasing number of applications, tracking location is essential. A social application might connect users based on location. A hospitality or travel application might use the user’s location to point out interesting sights or provide custom itineraries. A sensor application might store and analyze data that is both geospatial and time series in order to detect patterns, outliers, and anomalies and trigger an action in response.

Further, as geospatial technology matures, location-based applications are evolving from mainly mapping applications into sophisticated, cutting-edge programs that process and analyze millions of data points from mobile users, sensor networks, IoT devices, and other sources. The world is in constant motion, and our apps are beginning to catch on.

Location data presents an interesting challenge for the developer because querying it, or calculating position and distance, has to take into account longitude (x), latitude (y), and sometimes even elevation (z). The multidimensional nature of location data requires optimized mechanisms to process it -- treating the coordinates as ordinary, independent numbers is highly inefficient. If the database, whether an RDBMS or a NoSQL store, lacks the capabilities for handling geospatial data, application programmers have to do the extra work of preprocessing the data, or they have to build in logic that treats the data as geospatial.

Processing geospatial data is also a real-time, big data challenge. Applications that use and manage geospatial data must serve, at minimal latency, a high number of requests for location (“Where are you?”), updates to location (“I am here”), and searches for data by location (“Who or what is nearby?”).

Simple reads (fetch location) and writes (update location) are challenging at scale. Searching further compounds the challenge. The key to satisfying the above requirements is maintaining effective indexes for the data. An effective index is one that can facilitate speedy searches and isn’t expensive to maintain (in terms of memory and compute power).

The characteristics and performance of Redis make it an excellent fit for location-based applications. All that was missing was native support for geolocation data. Starting with version 3.2, however, Redis comes with geospatial indexing built in. Developers of applications that rely on geospatial data can now look to Redis to store, process, and analyze it -- with all of the speed and simplicity they’ve come to expect from Redis in other applications.

Brief intro to Redis

Redis is an in-memory data structure store that's commonly used as a database, a cache, and a message broker. Data structures in Redis are like Lego building blocks, helping developers achieve specific functionality with minimal complexity. Redis also minimizes network overhead and latency because operations are executed extremely efficiently in memory, right next to where the data is stored.

Redis data structures include Hashes, Sets, Sorted Sets, Lists, Strings, Bitmaps, and HyperLogLogs. These are highly optimized, each providing specialized commands that help you execute complex functionality with very little code. These data structures make Redis extremely powerful and allow Redis-based applications to handle extreme volumes of operations at very low latency.

Sorted Sets are particularly significant. Unique to Redis, they add an ordered view to members, sorted by scores. Sorted Sets are tremendously advantageous for processing data like bids, ranks, user points, and time stamps -- allowing analysis to be performed a couple of orders of magnitude faster compared to ordinary key/value or NoSQL stores.

Geospatial indexing is implemented in Redis using Sorted Sets as the underlying data structure, but with on-the-fly encoding and decoding of location data and new APIs. This means that location-specific indexing, searching, and sorting can all be offloaded to Redis, with very few lines of code and very little effort, using built-in commands like GEOADD, GEODIST, GEORADIUS, and GEORADIUSBYMEMBER.

When you combine this geospatial support with other Redis capabilities, some interesting functionality becomes extremely simple to implement. For example, by melding the new Geo Sets and PubSub, it is nearly trivial to set up a real-time tracking system in which every update to a member’s position is sent to all interested parties (think of a running or biking group where you want to track group members’ locations in real time).
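
As a rough sketch of that tracking idea, the function below records a member's new position with GEOADD and then announces it with PUBLISH. It takes an already-connected Node Redis client as a parameter. The function name updateLocation and the channel naming scheme ("updates:" followed by the group key) are assumptions made for illustration, not part of Redis.

```javascript
// Sketch: store a new position, then notify subscribers about it.
// `client` is assumed to be a connected Node Redis client (callback style).
function updateLocation(client, group, member, longitude, latitude, done) {
  // GEOADD adds the member or overwrites its previous coordinates.
  client.geoadd(group, String(longitude), String(latitude), member, function (err) {
    if (err) return done(err);
    // Hypothetical channel name: one channel per tracked group.
    const channel = 'updates:' + group;
    const message = JSON.stringify({ member, longitude, latitude });
    // PUBLISH replies with the number of subscribers that received the message.
    client.publish(channel, message, done);
  });
}
```

Interested parties would SUBSCRIBE to the same channel on a separate connection, because a connection in subscriber mode is not available for regular commands.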

The Geo Set

The Geo Set is the basis for working with geospatial data in Redis -- it is a data structure that is specialized for managing geospatial indices. Each Geo Set is made up of one or more members, with each member consisting of a unique identifier and a longitude/latitude pair. Like all of the data structures in Redis, Geo Sets are manipulated and queried with a small set of commands that are both simple to use and highly optimized.

Internally, Geo Sets are implemented with a Sorted Set. Sorted Sets exhibit a good space-time balance by consuming a linear amount of RAM while providing logarithmic computing complexity for most operations.

Creating and adding to the index

The Redis command for adding members to a geospatial index is called GEOADD. This command is used both for creating new sets and for adding members. The following example, illustrated from the command line and the Node Redis client, demonstrates its use.

Redis command example:

GEOADD locations 10.9971645 45.4435245 Romeo

Node Redis example:

redis.geoadd('locations', '10.9971645', '45.4435245', 'Romeo');

The above tells Redis to use a Geo Set called locations for storing the coordinates of the member named Romeo. Note that GEOADD expects the longitude before the latitude. If the locations key doesn’t exist, Redis creates it first. The new member will be added to the index if and only if it does not already exist in the set.

It is also possible to add multiple members to the index with a single call to GEOADD. By batching multiple operations in a single command, this form of invocation reduces the load on the database and the network.

Redis command example:

GEOADD locations 10.9971645 45.4435245 Mercutio 10.9962165 45.4419226 Juliet

Node Redis example:

redis.geoadd('locations', '10.9971645', '45.4435245', 'Mercutio', '10.9962165', '45.4419226', 'Juliet');
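
When the members to add live in an application-side list, the argument list for a batched GEOADD can be built programmatically. The following is a minimal sketch: the helper name toGeoAddArgs and the shape of the member objects are assumptions for illustration. The range check reflects the coordinate limits that GEOADD documents (longitude from -180 to 180, latitude from -85.05112878 to 85.05112878).

```javascript
// Latitude bound accepted by GEOADD; positions closer to the poles cannot be indexed.
const MAX_LATITUDE = 85.05112878;

// Flatten [{ name, longitude, latitude }, ...] into GEOADD's argument order:
// key, longitude, latitude, member, longitude, latitude, member, ...
function toGeoAddArgs(key, members) {
  const args = [key];
  for (const { name, longitude, latitude } of members) {
    // Written as negated "<=" tests so that NaN is rejected as well.
    if (!(Math.abs(longitude) <= 180) || !(Math.abs(latitude) <= MAX_LATITUDE)) {
      throw new RangeError('Coordinates out of range for ' + name);
    }
    args.push(String(longitude), String(latitude), name);
  }
  return args;
}

const batch = toGeoAddArgs('locations', [
  { name: 'Mercutio', longitude: 10.9971645, latitude: 45.4435245 },
  { name: 'Juliet', longitude: 10.9962165, latitude: 45.4419226 }
]);
// With a connected client, the batch is sent in one round trip:
// redis.geoadd(...batch);
```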

Updating the index

After a member and its coordinates have been recorded in the index, Redis allows you to update that member’s location. Updating members in a Geo Set is done by calling the same command used for adding them, namely GEOADD. When called with existing members, GEOADD simply updates the spatial data that is associated with each member with the new values. Therefore, once Romeo exits the house to begin his evening stroll, his updated location can be recorded with the following.

Redis command example:

GEOADD locations 10.999216 45.4432923 Romeo

Node Redis example:

redis.geoadd('locations', '10.999216', '45.4432923', 'Romeo');

Removing members from the index

Members may need to be deleted from the index at a later time. Because a Geo Set is stored as a Sorted Set, members are removed with the standard Sorted Set command ZREM. To delete a member (or members) from the set, call ZREM with the key name followed by the members to be deleted.

Redis command example:

ZREM locations Mercutio

Node Redis example:

redis.zrem('locations', 'Mercutio');

The geospatial index may be deleted entirely. Since the index is stored as a Redis key, the DEL command can be used to delete it.

Reading from the index

The data in a Geo Set index can be read in several ways. First, the index can be used for scanning through all of the members in it, whether in one big batch or in several smaller chunks. Redis provides two commands that can be used for iterating through the entire index: ZRANGE and ZSCAN. However, because these can be used to cover all of the indexed elements, this type of access to the data is mostly reserved for offline, noncritical operations (for example, ETL and reporting processes).
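
Scanning in smaller chunks with ZSCAN follows the usual cursor protocol: start with cursor 0, pass each returned cursor to the next call, and stop when the server returns 0 again. The sketch below assumes a connected callback-style Node Redis client passed in as a parameter; the function name scanGeoSet and the COUNT hint of 100 are illustrative choices. The scores it reports are the encoded positions, not readable coordinates, and ZSCAN may return the same member more than once if the set changes during the scan.

```javascript
// Walk every member of a Geo Set in chunks, using the ZSCAN cursor protocol.
// `client` is assumed to be a connected Node Redis client (callback style).
function scanGeoSet(client, key, onMember, done) {
  function nextChunk(cursor) {
    // COUNT is only a hint for how much work each call should do.
    client.zscan(key, cursor, 'COUNT', '100', function (err, reply) {
      if (err) return done(err);
      const nextCursor = String(reply[0]);
      const pairs = reply[1]; // flat array: member, score, member, score, ...
      for (let i = 0; i < pairs.length; i += 2) {
        onMember(pairs[i], pairs[i + 1]);
      }
      // A returned cursor of 0 means the iteration is complete.
      if (nextCursor === '0') return done(null);
      nextChunk(nextCursor);
    });
  }
  nextChunk('0');
}
```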

The second type of read access to the index is for fetching members’ coordinates, and to achieve that Redis provides two commands. The first of these commands is GEOPOS, which returns the coordinates for a given member in a Geo Set. Assuming that Romeo is continuing his walk, the answer regarding his current whereabouts is provided by executing the following.

Redis command example:

GEOPOS locations Romeo

1) 1) 10.999164
   2) 45.442681

Node Redis example:

redis.geopos('locations', 'Romeo', function(err, reply) {
  // reply holds one [longitude, latitude] pair per requested member
});

In the example above, the first line is the query, whereas the following lines are the database’s response. Redis provides another command called GEOHASH that reports the locations of members. While both practically perform the same function, the difference between them is that the output of GEOHASH is encoded as a standard geohash (more on geohashes below).
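
To give a feel for what a standard geohash string is, the following sketch implements the textbook encoding: it repeatedly halves the longitude and latitude ranges, emits one bit per halving (longitude first), and maps every five bits to a base-32 character. This is an independent illustration rather than the code Redis runs; the function name geohash and the default precision of 11 characters (the length of the strings GEOHASH returns) are choices made here.

```javascript
// Base-32 alphabet used by standard geohashes (no a, i, l, or o).
const BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

function geohash(longitude, latitude, precision = 11) {
  const lonRange = [-180, 180];
  const latRange = [-90, 90];
  let hash = '';
  let bits = 0;        // bits collected for the current character
  let value = 0;       // numeric value of those bits
  let useLongitude = true; // bits alternate: longitude, latitude, longitude, ...

  while (hash.length < precision) {
    const range = useLongitude ? lonRange : latRange;
    const coordinate = useLongitude ? longitude : latitude;
    const middle = (range[0] + range[1]) / 2;
    if (coordinate >= middle) {
      value = value * 2 + 1; // upper half: emit 1 and narrow the range upward
      range[0] = middle;
    } else {
      value = value * 2;     // lower half: emit 0 and narrow the range downward
      range[1] = middle;
    }
    useLongitude = !useLongitude;
    bits += 1;
    if (bits === 5) {
      hash += BASE32[value];
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

const prefix = geohash(10.40744, 57.64911, 6); // 'u4pruy'
```

Because each additional character only narrows the previous cell, two positions that are close together usually share a long common prefix.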

Another use for data that is stored in the index is computing distances between members. For any two members in the Geo Set, the GEODIST command will compute and return the distance between them.
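
Redis documents GEODIST as treating the Earth as a perfect sphere. A common way to compute such a great-circle distance is the haversine formula, sketched below for illustration. The function name haversineMeters and the mean Earth radius of 6,371,000 meters are assumptions of this sketch, so its results can differ slightly from what Redis reports.

```javascript
// Assumed mean Earth radius in meters (an illustrative value).
const EARTH_RADIUS_M = 6371000;

// Great-circle distance between two longitude/latitude points, in meters.
function haversineMeters(lon1, lat1, lon2, lat2) {
  const toRadians = (degrees) => (degrees * Math.PI) / 180;
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Distance between the coordinates used for Mercutio and Juliet above.
const meters = haversineMeters(10.9971645, 45.4435245, 10.9962165, 45.4419226);
```

In Redis itself the equivalent question is asked with a single call, for example GEODIST locations Mercutio Juliet m, where the trailing unit argument is optional.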

Searching the index

The last and perhaps most useful type of read access that the geospatial index enables is searching the data by its location. The most common example of such searches is finding indexed members within a certain distance of a given location. For that purpose, Redis provides the GEORADIUS command.

As the name suggests, GEORADIUS performs a search within a circle given by its center and its radius and returns the members that fall inside it. Another Redis command, GEORADIUSBYMEMBER, serves the same purpose but accepts one of the indexed members as the circle’s center. The following is an example of such a search.

Redis command example:

GEORADIUSBYMEMBER locations Romeo 100 m

1) "Juliet"

Node Redis example:

redis.georadiusbymember('locations', 'Romeo', '100', 'm', function(err, reply) {
  // reply is an array of the matching members' names
});

The search command also supports sorting the replies from nearest to furthest or vice versa (when no sort order is requested, the replies are returned unsorted), as well as returning the location and distance of each reply. Redis also allows storing the reply in another Sorted Set for further processing (such as paging and Set operations).
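
When the search is issued with the WITHDIST and WITHCOORD options, each reply element becomes a nested array instead of a bare member name. The sketch below assumes the reply shape [member, distance, [longitude, latitude]], with the numbers delivered as strings. The helper name parseRadiusReply, the member names, and all sample values are hypothetical.

```javascript
// A call that would produce such a reply (requires a connected client):
// redis.georadiusbymember('locations', 'Romeo', '100', 'm',
//   'WITHDIST', 'WITHCOORD', 'ASC', callback);

// Turn the nested reply into plain objects with numeric fields.
function parseRadiusReply(reply) {
  return reply.map(function ([member, distance, [longitude, latitude]]) {
    return {
      member: member,
      distance: parseFloat(distance),
      longitude: parseFloat(longitude),
      latitude: parseFloat(latitude)
    };
  });
}

// Made-up sample reply, used only to show the assumed shape.
const sampleReply = [
  ['memberA', '0.0000', ['10.5', '45.5']],
  ['memberB', '42.5000', ['10.5005', '45.5002']]
];
const hits = parseRadiusReply(sampleReply);
```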

Redis for geospatial data

The simplicity of implementing location-based functionality in Redis means that you can not only handle the flood of geodata easily, but also implement intelligence on top of simple processing. For example, the built-in radius query can help you implement simple functionality like “nearby items of interest” without swamping your user or your application with too many choices. Set intersection operations can help you isolate “items of interest” based on multiple filters like geographic location, user characteristics, and preferences.

Another benefit in efficiency accrues from the way Geo Sets are implemented. Geo Sets in Redis are simply another version of the powerful Sorted Sets, with the key difference that a Geo Set uses the geohash of each member’s longitude and latitude as that member’s score (plus on-the-fly encoding and decoding that is transparent to the user). Geohashing, a system invented by Gustavo Niemeyer, also makes it possible to search extremely efficiently. Because nearby locations tend to have numerically close geohashes, a search does not need to compute the distance to every indexed member; it can be limited to a few score ranges that cover the search area, which makes it both time and space efficient.
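
The idea of using a geohash as a Sorted Set score can be sketched as follows: each coordinate is quantized to 26 bits, and the two values are interleaved into one 52-bit integer, which a double-precision score can hold exactly. This is a simplified illustration, and the exact bit layout Redis uses internally may differ. The names quantize and toScore are made up for this sketch.

```javascript
const LATITUDE_LIMIT = 85.05112878; // latitude bound accepted by GEOADD
const BITS_PER_AXIS = 26;           // 26 + 26 = 52 bits in total

// Map a coordinate to the index of the cell it falls into along one axis.
function quantize(value, min, max) {
  const cells = 2 ** BITS_PER_AXIS;
  const cell = Math.floor(((value - min) / (max - min)) * cells);
  // Clamp so that the upper edge of the range lands in the last cell.
  return Math.min(cells - 1, Math.max(0, cell));
}

// Interleave longitude and latitude bits, most significant bits first.
function toScore(longitude, latitude) {
  const lonCell = quantize(longitude, -180, 180);
  const latCell = quantize(latitude, -LATITUDE_LIMIT, LATITUDE_LIMIT);
  let score = 0n;
  for (let bit = BITS_PER_AXIS - 1; bit >= 0; bit--) {
    score = (score << 1n) | BigInt((lonCell >> bit) & 1);
    score = (score << 1n) | BigInt((latCell >> bit) & 1);
  }
  return score;
}

const romeoScore = toScore(10.999216, 45.4432923);
const julietScore = toScore(10.9962165, 45.4419226);
```

Scores built this way share their leading bits whenever two positions fall into the same coarse cell, which is what lets a search work on score ranges instead of individual members.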

Other libraries available add interesting capabilities, like including elevation in calculations. For example, you may be tracking a drone or group of drones at different elevations, carrying sensors that measure wind conditions or temperature differences in a location. The required combination of Sets and Sorted Sets is provided in this xyzsets API in the Geo Lua library available on GitHub.

Path length calculations, typically needed for navigating between waypoints to particular destinations, can be easily accomplished with the geopathlen API. Real-time tracking is easily implemented with this location updates API.

If your application uses location data in any way, consider offloading a lot of the hard work to Redis. For very large data sets, it might be more cost-effective to use Redis on Flash, which uses a combination of RAM and flash memory to deliver the extreme throughput and submillisecond latencies characteristic of Redis. For more technical details on using Redis for geospatial data, including geohash searching and advanced capabilities with Lua, see the Redis for Geospatial Data whitepaper.

Itamar Haber is chief developer advocate for Redis Labs.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.
