New Job Postings

By Mon, May 20 2013 at 03:15PM EDT From OpenGeo

OpenGeo is looking for talented people to join our team. We offer interesting technical work, competitive salaries, great benefits, and a fantastic working environment. Most importantly, we challenge our employees to build the best open source and interoperable tools for spatial data on the web. We added a few new posts this week; if any look like a fit for you, please apply!

Here’s a list of our open positions:

UX Developer -  We’re seeking a talented user experience developer to design and implement creative user interfaces for our innovative open source geospatial software.

Support Manager -  OpenGeo is looking for a support manager to ensure that customers large and small are familiarized with our software, properly trained in its function, and supported if anything should go wrong. The ability to think quickly and communicate clearly in a fast-paced environment is essential. Enthusiastic problem-solving skills and a desire to be engaged at all levels of a problem are even better.

Software Project Manager -  OpenGeo is seeking a skilled Software Project Manager to help us bring open source software to governments, commercial enterprises, NGOs, and other organizations around the world.

Java Developer - OpenGeo is seeking skilled software engineers interested in helping us bring open source software to organizations around the world. Our team improves the open source components underlying the OpenGeo Suite, allowing a wide variety of customers to share and edit data using open standards.

Front End Developer -  We’re looking for someone who is ready to work with peers in design and engineering to create pixel-perfect interfaces across a range of projects and products. You’ll own the code-base, work on the hard problems, build your ideas into reality, and help determine best practices throughout our organization.

Sales Account Manager – Our current (and future) clients are looking to open source to solve their spatial IT needs. Our account managers help commercial enterprises and federal clients use our innovative, open source geospatial software as efficiently and effectively as possible, allowing them to get more than ever out of their geospatial instances.

Here’s the full list. Please apply and/or spread the word!

GeoServer 2.3.2 released

By Sun, May 19 2013 at 11:33PM EDT From GeoServer Blog

The GeoServer team is pleased to announce the release of GeoServer 2.3.2 for download.

  • This release includes and is made in conjunction with GeoTools 9.2.
  • The INSPIRE plugin has now graduated to extension and is included in this release. This plugin adds WMS and WFS capabilities support for metadata required for compliance with the European INSPIRE directive.
  • The application schema support (app-schema) plugin now enables joining by default for data sources that support it.
  • Fixed transformation problems with projections based on Hotine Oblique Mercator (variant B) (for example Swiss CH1903 / LV03)
  • Fixed WFS lockups when a WFS 1.1 GetFeature is providing a schema referring back to the same server DescribeFeatureType
  • A new option to limit the file browser to the data directory, geared towards high security/multi-tenant environments

More details can be found in the GeoServer 2.3.2 Release Notes.

OpenGeo Emerges

By Wed, May 15 2013 at 02:10PM EDT From OpenGeo

This week OpenGeo took an exciting and important step forward as an organization. We’ve taken on investment and spun out from OpenPlans, our long time parent organization, to establish ourselves as an independent company. Our growth and this successful step out on our own are the result of our amazing team and the success of open source geospatial software that we’ve been working on for over ten years.

Vanedge Capital, a Vancouver-based venture capital firm, led the Series A round of investment that made this possible. We are truly excited to begin a partnership with Vanedge, an innovative fund led by partners who know how to grow and manage software technology companies.

This investment provides the capital we need to meet our objectives and continue to develop innovative technologies. If you’re a regular reader of this blog or have seen us lately at conferences or events, you know about the ambitious projects we’ve been working on: through-the-web-processing, breaking out of the GIS work-flow with Spatial IT, geospatial web-analytics and distributed versioning for geospatial – to name a few. This type of development requires not just the strong technical skills and forward-looking leadership that our team has, but it also requires resources, which Vanedge’s investment provides.

This investment also allows us to achieve our long-planned separation from OpenPlans, which founded and incubated us. We are grateful for the support and vision of OpenPlans over the years. And, since OpenPlans remains an investor in our new company, we’re looking forward to our continued partnership with them.

Our mission remains the same: to build the highest quality software for location and mapping, available to all. This investment gives us a stronger base of resources to support the open source communities we work with. We remain committed to the open source principles of collaboration, transparency, and freedom. We’ll be doing even more to develop the best geospatial tools while supporting the open source communities and our customers alike.

Look for more from us about the future of Spatial IT and how we can help you get there.

New GeoServer community on Google+

By Mon, May 13 2013 at 04:07AM EDT From GeoServer Blog

Being social and sharing with others is one of the keystones of an open source community.

Traditionally, open source communities thrive on mailing lists and IRC channels; however, it’s not news that many people prefer other media for sharing thoughts and experiences and for asking for help. For example, people with an interest in GeoServer are already active on Twitter and StackExchange.

Simone recently created a new Google+ GeoServer community, adding one more choice to the mix: https://plus.google.com/communities/101905665894825745986

If you like mailing lists, do not worry: the GeoServer users mailing list will still be the primary and official means of getting support (and we very much suggest you hang out there too). The Google+ community is just an alternate means of communication that some of us might want to try out.

So, if you have an interest in GeoServer and your preferred medium is Google+, hop on. If we see there is enough interest, we might start doing hangouts and public presentations to leverage this social platform’s extra features.

GeoServer training in Milan, 6/7 June 2013

By Fri, May 03 2013 at 08:17AM EDT From GeoServer Blog

If you are looking for GeoServer training, save the date: GeoSolutions will be providing a two-day introduction to GeoServer on 6/7 June 2013 in Milan.

The training will be held in Italian by GeoServer core contributors, and will cover a basic introduction to GeoServer, setting up vector and raster data, basic and advanced SLD styling, raster data tuning, integration with Google Earth, tile caching with the integrated GeoWebCache, delivering and filtering vector data with WFS, security, integration with the REST interface and setting up for production. The experience will be hands-on, with one computer per participant. Discounts are available for university students.

If you are interested but cannot make it, or would like to get training in another language, go have a look at the GeoServer training page to find more training opportunities.

GeoServer in a clustered configuration (part 2)

By Tue, Apr 30 2013 at 05:04PM EDT From OpenGeo

In our last post on clustering, we talked about the theory behind some different options for clustering. In this post, we’ll go into an example of clustering, taken from our recent experience with one of our OpenGeo Suite Enterprise clients. If you’ll be attending FOSS4G-NA and want to learn more about clustering and GeoServer, consider attending our GeoServer training and Juan Marin’s GeoServer in Production presentation (scheduled for 5/23/2013 at 11:30 am).

Clustering Scenario

In the following scenario, we will work through the installation and configuration of two GeoServers, each inside its own servlet container instance on the same machine. Each servlet container will use the same JRE and the same container binaries (Apache Tomcat 7), but they will have independent configurations that allow them to run on different ports. These two GeoServer/Tomcat instances will be fronted by a local software proxy called HAProxy, which acts as an HTTP/TCP load balancer. The load balancer configuration in this example provides very basic “round robin” balancing of GeoServers. More sophisticated load-balancing configurations are possible, but are beyond the scope of this example. All GeoServers will be deployed as WAR files placed into each of the Tomcat webapps directories. It is possible to have multiple instances of Tomcat share a single web-application through the use of contexts. This is useful if you anticipate your web-application (GeoServer) will be changed/updated frequently, but isn’t necessary.

Implementation

The following steps will walk through the installation and configuration of a basic cluster containing two GeoServers in separate Tomcat servlet containers, behind an HAProxy load-balancer/proxy on the same machine (high-performance). The steps are:

  1. Download and unpack Tomcat binaries
  2. Create individual Tomcat instances
  3. Start Tomcat instances and deploy GeoServer applications
  4. Install and configure load balancer / proxy server

Following this walk-through, we will discuss other options for and extensions of this configuration such as high-availability, alternate proxy tools/configurations, database-backed catalogs, and triggered configuration reloads.

Download and unpack Tomcat binaries

  1. Start by downloading a binary distribution of Tomcat to your machine. This example uses the latest version (7.0.39 at the time of writing) as a .tar.gz file from http://tomcat.apache.org/download-70.cgi.
  2. Unpack this archive to a suitable location: In this example the entire contents of the archive are extracted to /var/tomcat. By convention, we’ll refer to this directory as $CATALINA_HOME.

Create individual Tomcat instances

Next we’ll make two directories for two separate instances of Tomcat to run from. Both instances will use the same Tomcat binaries from the directory created above; the directories created here will just hold the logs and configuration for each of our instances. In this example we’ll make two instance directories, /var/tomcat1 and /var/tomcat2. These will be referred to as $CATALINA_BASE directories. Each of these directories needs a basic structure and some initial content to host a Tomcat instance. Run the following commands (or a variation thereof) to create the basic directory structure.

# mkdir -p /var/tomcat1/conf
# mkdir -p /var/tomcat1/logs
# mkdir -p /var/tomcat1/temp
# mkdir -p /var/tomcat1/webapps
# mkdir -p /var/tomcat1/work
# mkdir -p /var/tomcat2/conf
# mkdir -p /var/tomcat2/logs
# mkdir -p /var/tomcat2/temp
# mkdir -p /var/tomcat2/webapps
# mkdir -p /var/tomcat2/work
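
The same directory layout can also be created with a short loop. This is a minimal sketch rather than part of the original walk-through: the function name make_instance_dirs and the idea of passing the instance directories as arguments are assumptions for illustration.

```shell
#!/bin/sh
# Create the Tomcat instance skeleton (conf, logs, temp, webapps, work)
# under every directory given on the command line.
make_instance_dirs() {
    # $1 = instance directory, e.g. /var/tomcat1
    for sub in conf logs temp webapps work; do
        # -p also creates the parent directory and tolerates re-runs
        mkdir -p "$1/$sub" || return 1
    done
}

for dir in "$@"; do
    make_instance_dirs "$dir"
done
```

Saved under a hypothetical name such as make-instance-dirs.sh, it could be run as: sh make-instance-dirs.sh /var/tomcat1 /var/tomcat2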

Each Tomcat instance needs two configuration files to start with that define how the service will run (name, host(s), and ports). Copy the files server.xml and web.xml from the $CATALINA_HOME/conf directory into each of the $CATALINA_BASE/conf directories. We’ll need to edit each of the server.xml files so that each Tomcat instance runs and shuts down on different ports. In each file, look for lines like:

<Server port="8005" shutdown="SHUTDOWN">

<Connector port="8080" protocol="HTTP/1.1"
 connectionTimeout="20000"
 redirectPort="8443" />

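Change these port values. The values can typically be any unused port above 1024. Within each file, the SHUTDOWN and HTTP Connector ports must differ from each other, and the values in tomcat1/conf/server.xml must differ from those in tomcat2/conf/server.xml. If the file also defines an AJP connector (the stock Tomcat 7 server.xml has one on port 8009), give it a distinct port in each instance or comment it out. For example: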

  • On tomcat1: SHUTDOWN on port 8005, serve HTTP requests on 8085
  • On tomcat2: SHUTDOWN on port 8006, serve HTTP requests on 8086
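
Copying the two files and rewriting the ports can be scripted. The sketch below assumes the attribute layout of the stock Tomcat 7 server.xml shown above; configure_instance and its argument order are hypothetical, and any AJP connector in the file is left untouched.

```shell
#!/bin/sh
# Copy web.xml and server.xml from the shared Tomcat install into one
# instance, rewriting the shutdown port and the HTTP connector port.
configure_instance() {
    # $1 = CATALINA_HOME, $2 = CATALINA_BASE, $3 = shutdown port, $4 = HTTP port
    cp "$1/conf/web.xml" "$2/conf/web.xml" || return 1
    # The second expression tolerates optional spaces around '=' after "protocol"
    sed -e "s|<Server port=\"[0-9]*\"|<Server port=\"$3\"|" \
        -e "s|<Connector port=\"[0-9]*\" protocol *= *\"HTTP/1.1\"|<Connector port=\"$4\" protocol=\"HTTP/1.1\"|" \
        "$1/conf/server.xml" > "$2/conf/server.xml"
}

# Only act when called with all four arguments
if [ "$#" -eq 4 ]; then
    configure_instance "$1" "$2" "$3" "$4"
fi
```

For the ports listed above, it would be called once per instance, for example with the arguments /var/tomcat /var/tomcat1 8005 8085 and then /var/tomcat /var/tomcat2 8006 8086.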

Start Tomcat instances and deploy GeoServer applications

Once configured, we’re ready to start up the servlet containers. We can do this by running a series of commands from the terminal; however, it might be more pragmatic to write a small script, since these steps often need to be repeated. Create two scripts: /var/tomcat1/tomcat1.sh and /var/tomcat2/tomcat2.sh. These scripts will be identical except for the value of the $CATALINA_BASE variable.

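#!/bin/sh
# Shared Tomcat binaries and this instance's own directory
export CATALINA_HOME=/var/tomcat
export CATALINA_BASE=/var/tomcat1
export CATALINA_TMPDIR=$CATALINA_BASE/temp
# Adjust to the JRE installed on your machine
export JRE_HOME=/usr/lib/jvm/java-6-openjre/jre
# Entries are separated by ':' on Linux; an unquoted ';' would end the command
export CLASSPATH=$CATALINA_HOME/bin/bootstrap.jar:$CATALINA_HOME/bin/tomcat-juli.jar

$CATALINA_HOME/bin/catalina.sh start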

These scripts will perform the following functions:

  • Define the location of the Tomcat binaries in $CATALINA_HOME
  • Set the location for the current container in $CATALINA_BASE and $CATALINA_TMPDIR
  • Define the location of the JRE for Tomcat to use (we’re assuming Java is installed)
  • Define the CLASSPATH to the core Tomcat JARs
  • Export each environment variable so they are available to the Tomcat start-up calls
  • Run the Catalina control script to start the Tomcat service

With the files created, run both /var/tomcat1/tomcat1.sh and /var/tomcat2/tomcat2.sh to configure and start the two services. Note that you’ll need to make the files executable. To confirm that the services started correctly, you can tail the catalina.out files in /var/tomcat1/logs and /var/tomcat2/logs. Next, copy the geoserver.war file (or unpacked application directory) into each of the /var/tomcat1/webapps and /var/tomcat2/webapps directories. The applications should automatically deploy. Confirm that you have two GeoServers running independently by browsing to each one on their respective port.

Two GeoServers

Install and configure load balancer / proxy server

Install a web server that is capable of acting as a load balancer in front of the cluster. In this example we use HAProxy installed on Ubuntu Linux using standard package management. That said, there are many other tools you can use as a front-end load balancer / proxy server. One example would be Apache HTTP Server with mod_proxy_http and mod_proxy_balancer. Microsoft IIS can also sit in front of Tomcat instances using the Network Load Balancer (NLB) or Application Request Routing (ARR). Configure haproxy to act as a load balancer using the following sample in /etc/haproxy/haproxy.cfg. Stop haproxy, copy / edit the config file in place, and then restart.

global
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000
    log 127.0.0.1 local0
    log 127.0.0.1 local7 debug

frontend http-in
    bind *:80
    default_backend GeoServer

backend GeoServer
    balance roundrobin
    server GeoServer1 localhost:8085 maxconn 32 check
    server GeoServer2 localhost:8086 maxconn 32 check
    ##option httpchk
    ##option forwardfor

listen admin
    bind *:8080
    stats enable

(Note that these are very basic configurations. Your system administrator will have a better idea of what is the norm for your organization.) There are a few ways you can confirm that HAProxy is working and balancing the clustered back-ends as anticipated. 1) Browse to the HAProxy admin page at http://<server>:8080/haproxy?stats:

HAProxy stats

In the figure above we see that HAProxy recognizes the GeoServer backend with two members, GeoServer1 and GeoServer2. 2) You might also make a noticeable configuration change to one of the GeoServers and observe that the single proxy URL is requesting data from all members of the backend. For example, changing a single SLD file on one instance and then repeatedly requesting a layer that uses that style through HAProxy confirms that both of our GeoServers are handling requests.

Conflicting styles from two different GeoServers through HAProxy

A Common Data Directory

Normally, GeoServer data directories in a cluster will be identical in content. In this example we will set a common data directory for all cluster members using a context parameter in the web.xml file of each GeoServer in our cluster. You can specify the GeoServer data directory location several other ways. Make a new directory for your shared GeoServer Data Directory.

# mkdir /geoserver_data

Copy a template data directory into the new location. This step is optional. As long as the base data directory exists, GeoServer will create the basic configurations it needs if they’re not found on start-up. It’s just a bit more painless for this example if they are where we expect them.

# cp -r /var/tomcat1/webapps/geoserver/data/* /geoserver_data

Stop the two GeoServer / Tomcat instances so we can reconfigure our data directory locations, either by killing the identified pids for the java processes that our GeoServers are running under, or (in a more sophisticated installation) using a service. Update the web.xml file in each web application WEB-INF directory. In this setup the file lives at $CATALINA_BASE/webapps/geoserver/WEB-INF/web.xml for each instance. Make four changes to each file:

  • Specify the new location of the GEOSERVER_DATA_DIR
  • Specify a GEOSERVER_LOG_LOCATION for this particular instance to log to. This will avoid collisions with the other GeoServer nodes in the cluster writing to the same location. For example, set to /geoserver_data/logs/geoserver_tomcat1.log
  • Set GWC_DISKQUOTA_DISABLED to true. This will avoid collisions with the other GeoServer nodes’ GWCs writing disk use information to common locations.
  • Set GWC_METASTORE_DISABLED to true. This will avoid collisions with the other GeoServer nodes’ GWCs writing cache status information to common locations.
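
In web.xml, each of these settings takes the form of a context-param element placed inside the web-app element. As a rough sketch, the following script prints the four entries for a given instance so they can be pasted into that instance's file. The function names are made up for this example, and the paths are the ones used in this walk-through.

```shell
#!/bin/sh
# Print one <context-param> element for a GeoServer web.xml.
context_param() {
    # $1 = parameter name, $2 = parameter value
    printf '<context-param>\n  <param-name>%s</param-name>\n  <param-value>%s</param-value>\n</context-param>\n' "$1" "$2"
}

# Print the four settings for the instance named in $1 (e.g. tomcat1).
instance_params() {
    context_param GEOSERVER_DATA_DIR /geoserver_data
    # Each instance logs to its own file to avoid collisions
    context_param GEOSERVER_LOG_LOCATION "/geoserver_data/logs/geoserver_$1.log"
    context_param GWC_DISKQUOTA_DISABLED true
    context_param GWC_METASTORE_DISABLED true
}

for name in "$@"; do
    instance_params "$name"
done
```

Only GEOSERVER_LOG_LOCATION differs between instances; the other three values are identical across the cluster.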

Extending These Examples

This document is intended as just an example of setting up GeoServer in a clustered configuration, designed to move users towards a more scalable GeoServer installation, but might not be suitable for production in all environments. Future versions and alternate scenarios will take into consideration:

  • Scaling GeoServer horizontally (on multiple machines)
  • Hybrids of vertically and horizontally scaled instances
  • Better ways to manage and configure multiple servlet containers (Tomcat) and web applications (GeoServer)

What sort of GeoServer clustering environments are you interested in setting up? Let us know in the comments below.

Alpha releases

By Wed, Apr 24 2013 at 03:47PM EDT From OpenGeo

One thing I love about open source development is the ‘alpha’ release.

Last week was an exciting week of alphas for OpenGeo: both OpenLayers 3.0 and GeoGit had their first releases and launched new websites. The two websites are admittedly not very sophisticated (I made geogit.org with GitHub’s page generator and Andreas pulled together ol3js.org with Bootstrap), but awesome websites can come later. The point of these alpha releases is to get something out in the world and widen the open source process to new users and potential contributors.

Alpha releases are rarely seen in proprietary software development since software in an alpha state is generally quite buggy. To quote Wikipedia: “alpha software can be unstable and could cause crashes or data loss.” At this point many would turn away and run as far from the software as possible, but to me it’s an awesome thing, an understood pact between the developers and the users that says: “hey, we’re not perfect, and we know our software is far from perfect, but if you understand the risks we’d be really excited to show it to you.”

The process opens up a dialog of equals—not the typical consumer relationship, but a collaborative one. The user of alpha software actually has a responsibility to communicate when (not if) things go wrong and to tell the developers how it crashes, what important option isn’t there, how the installation fails, or even how the website is confusing. In this way, responsibility can grow from being an alpha user to include helping with documentation, improving the website, debugging problems, contributing patches, and eventually building major new features as a core developer. Indeed the point of the alpha release is to put a stake in the ground and open the process to gain feedback from others, allowing users and developers to build the future together. Everyone is expected to be a true participant, in the fullest sense of the word, with responsibilities as well as privileges as opposed to just a passive consumer.

We encourage you to check out both the OpenLayers 3.0 and GeoGit alpha releases and let the teams know what you think. OpenLayers in particular has a very solid core but is looking for practical input from real users. We think the projects show a lot of potential, and we’re excited for your feedback, encouragement, and even contributions. Don’t hesitate to jump in and join us as we build the geospatial future together.

GeoServer 2.3.1 released

By Tue, Apr 23 2013 at 01:15PM EDT From GeoServer Blog

The GeoServer team is happy to announce the release of GeoServer 2.3.1, now available for download.

This is the first bug-fix release of the 2.3.x stable series, and packs 34 changes, split between bug fixes and improvements in different areas. Here are some highlights:

  • The XSLT output format generator module graduated from community to official extension (work performed by GeoSolutions’ Andrea Aime with sponsorship from the City of Wien)
  • A couple of changes in the SQL Server data store increase performance, provided you enable the relevant flags. One is the ability to transfer geometries in SQL Server native format, as opposed to using WKB, which avoids the slow WKB generation routines included in SQL Server; the other allows disabling native paging, which can sometimes lead the SQL Server query planner to create bad data access plans (GEOS-5750 and GEOS-5314). The native geometry serialization was sponsored by the Norwegian Directorate for Nature Management and performed by Bouvet; thanks to Stewart Loving-Gibbard for the paging patch.
  • A new flag has been added to prevent the resolution of XML external entities (normally enabled by default in XML parsers), which can be used to prevent some kinds of XML attacks (GEOS-5273 and GEOS-5314). The work was performed by GeoSolutions under sponsorship from the City of Wien.
  • Some SLD related fixes (GEOS-4214, GEOS-5767) (thanks to GeoSolutions’ Andrea and Carlo for those)
  • A bunch of improvements in the monitoring extension (GEOS-5725, GEOS-5732, GEOS-5758 and GEOS-5766) (thanks to OpenGeo’s Justin Deoliveria and Ian Shneider for those, plus one from Andrea)
  • Fixed an issue that prevented GeoServer from serving back GetMap requests under Windows when using the JRE (GEOS-5768) (thanks go to GeoSolutions’ Simone for this one)
  • GWC will now respect the layer HTTP caching settings, and a memory leak has been fixed in the WFS-T integration (GEOS-5686, GEOS-5659) (thanks to OpenGeo’s Ian Shneider for these)
  • The German translation maintainers fixed an encoding problem which resulted in weird characters appearing in the GeoServer UI (GEOS-5641) (thanks go to Frank Gasdorf and Oskar Fonts for their continuous work on UI internationalization)
  • GeoServer UI is completely available in Dutch, German and Korean (thanks go to Wouter van Nifterick, Frank Gasdorf, Stefan Engelhardt, Minpa Lee and others); feel free to review and contribute at Transifex.com
  • Finally, a few fixes and improvements in the security subsystem (GEOS-5698, GEOS-5751, GEOS-5753, GEOS-5783) (thanks to Christian Mueller for his continuous work on the security subsystem)

And there is more: you can find everything in the GeoServer release notes. Also, looking at the corresponding GeoTools 9.1 release notes, we can find some extra goodies:

  • Improved support for sparse shapefiles (GEOT-2791) (thanks to Dieter DePaepe)
  • Added support for UUID data type in PostGIS data stores (GEOT-4414) (thanks to Shane StClair)

Download GeoServer 2.3.1, try it out, and provide feedback on the GeoServer mailing list.

Thanks again for using GeoServer!

GeoServer in a clustered configuration (part 1)

By Thu, Apr 18 2013 at 11:26AM EDT From OpenGeo

Recently, we helped one of our clients who wanted to set up a GeoServer cluster. There are different ways to accomplish clustering depending on your specific needs, but we thought it would be illustrative to show what we did in this particular situation. Keep in mind this is a specific treatment and fairly tailored. We encourage you all to experiment with the newest features, but remember to do so in your testing environment!

We’ll start with some clustering theory and tips before launching into the actual details of how to do it.

Background

A computing cluster consists of two or more machines working together to provide a higher level of availability, reliability, and scalability than can be obtained from a single node. Nodes in a cluster are positioned behind a proxy server and/or load balancer that delegates requests to cluster members based on any one member’s ability/availability to handle load.

Clustering

Similar to other applications with long-running in-memory states and high data I/O, GeoServer sees performance gains with two (or more) nodes clustered behind a load balancer—even with the slight overhead of the load balancer that sits in front of the cluster.

Generally, there are two complementary purposes for clustering GeoServer:

  • To provide high-performance and/or throughput
  • To achieve high availability

In the most demanding situations, GeoServer can be deployed in combinations of high-performance and high-availability instances.

High-Performance Clusters

A high-performance GeoServer configuration deploys several instances of GeoServer on a single machine.

High-performance cluster

Each GeoServer instance is deployed into its own servlet container (Tomcat, Jetty, etc.). Individual servlet containers are configured independently and spin up their own JVM, each with its own memory and processor allocations (borrowed from the pool of resources on the host machine). GeoServer’s memory and CPU runtime footprint are optimized for high throughput under heavy concurrency with such a deployment, but always consider that these different deployed units will compete for the physical server’s resources. To find the best balance, we recommend, as always, testing for your particular scenario.

A load balancer or proxy fronts the cluster, and directs traffic to the member of the cluster most able to handle the current request. In this case, nodes will likely share the same server name or IP address, but listen for requests on different ports. For example:

Load Balancer @ http://<server>:80/<alias> forwards to one of:

  • GeoServer 1 in Tomcat 1 @ http://<server>:8081/geoserver
  • GeoServer 2 in Tomcat 2 @ http://<server>:8082/geoserver
  • GeoServer 3 in Tomcat 3 @ http://<server>:8083/geoserver
  • GeoServer 4 in Tomcat 4 @ http://<server>:8084/geoserver
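
As an illustration of this port-per-instance layout, the sketch below prints a round-robin backend definition in HAProxy syntax. HAProxy is only one possible load balancer and is an assumption here, as are the backend name GeoServer, the server names, and the function name backend_servers.

```shell
#!/bin/sh
# Emit one HAProxy "server" line per local GeoServer instance.
backend_servers() {
    # $1 = first port, $2 = number of instances on consecutive ports
    i=1
    while [ "$i" -le "$2" ]; do
        echo "    server GeoServer$i localhost:$(( $1 + i - 1 )) check"
        i=$(( i + 1 ))
    done
}

# Only print a backend when called with both arguments
if [ "$#" -eq 2 ]; then
    echo "backend GeoServer"
    echo "    balance roundrobin"
    backend_servers "$1" "$2"
fi
```

Called with the arguments 8081 4, it prints a backend with four members on ports 8081 through 8084, matching the list above.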

An approach that deploys multiple instances of GeoServer into the same servlet container is not recommended. In this case, since host resource allocation (to a common JVM) will not be sequestered as neatly, competition for those resources will occur, limiting the benefits.

Users might also consider using the built-in clustering capabilities found in Enterprise Application Servers (such as Oracle WebLogic or JBoss); however, this is beyond the scope of this discussion.

High-Availability Clusters

A high-availability implementation will spread several GeoServer instances across several machines (nodes) in a cluster. These nodes can be physical or virtual machines.

High-availability cluster

Nodes are normally located behind a load balancer that redirects traffic to any single GeoServer based on traffic volume and availability. In this case, nodes will likely be on different servers or IP addresses and listen for requests on the same port. For example:

Load Balancer @ http://<server>:80/<alias> forwards to one of:

  • GeoServer 1 in Tomcat 1 @ http://<server1>:8080/geoserver
  • GeoServer 2 in Tomcat 2 @ http://<server2>:8080/geoserver
  • GeoServer 3 in Tomcat 3 @ http://<server3>:8080/geoserver

Data directory location and catalog reloads

Some important considerations to be made when clustering several instances of GeoServer concern the location of the GeoServer data directory and a strategy for reloading all cluster members’ data catalogs.

The GeoServer data directory is the location in the file system where GeoServer stores its configuration information. The configuration defines things such as what data is served by GeoServer, where it is stored, and how services such as WFS and WMS interact with and serve the data. The data directory also contains a number of support files used by GeoServer for various purposes.

The spatial data accessed by GeoServer doesn’t need to reside within the GeoServer data directory, just pointers to the data locations. This should be obvious for data stored in spatial databases, which are certainly in different locations (on disk) and often on different machines; however the same is true for file-based spatial data. (Read more about the GeoServer data directory.)

GeoServer’s catalog is an in-memory representation of the configurations in the data directory. Storing the configurations in memory means that GeoServer can access this information faster than by reading these instructions off disk. However, this sometimes requires that the in-memory catalog be refreshed when configuration changes are made to the disk-based GeoServer data directory, or to the actual data served in GeoServer.

Unless catalog (re)configurations are largely static, or some amount of catalog discrepancy or availability is acceptable, a common GeoServer data directory location for all clustered instances is highly recommended.

The location of the GeoServer data directory is stored in the GEOSERVER_DATA_DIR variable. It can be configured in one of three ways: in each instance’s web.xml file (/webapps/geoserver/WEB-INF), through a common environment variable, or through a parameter passed to the JVM in the container start-up command.
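
For the second and third options, a start-up script might contain lines like the following. This is a sketch: /geoserver_data is an example path, and in practice you would pick one mechanism rather than set both. CATALINA_OPTS is the variable Tomcat's catalina.sh reads for extra JVM options.

```shell
#!/bin/sh
# Example location of the shared data directory
DATA_DIR=/geoserver_data

# Option: environment variable, inherited by the JVM that Tomcat starts
export GEOSERVER_DATA_DIR="$DATA_DIR"

# Option: JVM system property, appended to any options already set
export CATALINA_OPTS="${CATALINA_OPTS:-} -DGEOSERVER_DATA_DIR=$DATA_DIR"
```

Either line would go before the call to catalina.sh so that the setting is in place when the container starts.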

Some implementations have clustered GeoServer instances using separate data directories that are synchronized manually (low change frequency) or automatically (using rsync), but neither approach is as common or recommended as a shared data directory.

Regardless of the mechanism for synchronization, changes to the data directory and the in-memory catalog will normally be directed by one master GeoServer. This can be enforced by disabling the GeoServer user interface on all “slave” GeoServers or by configuring the front-end load balancer to only direct user interface requests to /geoserver/web to the master GeoServer.

Changes to the master GeoServer’s data catalog must be explicitly refreshed on slave instances. This can be accomplished manually through the GeoServer Admin web UI (/geoserver/web), or with some measure of automation (on a schedule, or after a trigger is fired) using GeoServer’s REST API (e.g. by sending a POST/PUT request to /geoserver/rest/reload?recurse=true).
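
A scheduled or triggered reload might be scripted with curl along these lines. Treat it as a sketch under assumptions: the reload path is the one quoted above, GS_USER and GS_PASS are hypothetical variables holding admin credentials, and DRY_RUN is an invented switch that prints the requests instead of sending them.

```shell
#!/bin/sh
# Ask each GeoServer base URL given as an argument to reload its catalog.
reload_catalogs() {
    for base in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Show the request without sending it (password masked)
            echo "curl -u $GS_USER:*** -X POST $base/rest/reload?recurse=true"
        else
            curl -sf -u "$GS_USER:$GS_PASS" -X POST "$base/rest/reload?recurse=true" \
                || echo "reload failed: $base" >&2
        fi
    done
}

reload_catalogs "$@"
```

Passing the base URL of every slave instance (for example http://localhost:8086/geoserver) reloads them one after another; a failure on one node is reported on stderr and does not stop the loop.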

Clustering Enhancements

Enhancements to our clustering story are coming! Specifically, in future releases of GeoServer, the data directory will have the option to be database-backed. This means that a central configuration store can be queried more efficiently than a file-based counterpart, and doesn’t all need to be read into memory.

In the next post, we’ll go into the details on setting up a clustered instance. Remember, Enterprise: Platform clients and higher get custom clustering and deployment advice included in their maintenance agreements.

Have you been looking at deploying GeoServer in a clustered environment? Tell us about it!

Why We Sprint

By Wed, Apr 03 2013 at 12:42PM EDT From OpenGeo

I spent last week in Boston, attending an annual code sprint for C-based open source geospatial projects. I’ve been doing this every year since 2008. Since getting back, I’ve had to explain the event to several people, technical and non-technical, since the concept isn’t obvious at all.

Open source development is characterized by some features that differ a great deal from traditional work environments:

  • the developers work asynchronously, often in different time zones, usually in different locations,
  • the developers coordinate exclusively using text tools, like e-mail, issue tracking systems, and sometimes instant messaging

Because there is no need to be in the same space with other developers, either physically or even temporally, the barriers to entry to a project are lowered. More people can participate than otherwise.

However, there are disadvantages to working asynchronously and with text communications.

  • asking for help when you get stuck can be time consuming, because your colleagues might be asleep at the moment when help would be most useful
  • issues of subtlety or complexity take a great deal of text to describe, and any misunderstandings on the part of a reader take even more text to correct
  • discussion of emotional issues can lead to conflict due to the limited emotional nuance in text communication

A code sprint is a chance to work for a time with your open source colleagues “the old fashioned way”, face to face, on the same clock.

Because everyone is together, and communications are high-bandwidth and high-fidelity, a code sprint is a great time for:

  • planning and designing large scale changes to the code
  • designing new APIs or new user interfaces, and
  • triaging ticket lists to prepare for release

I usually spend the first half of a sprint on communication-heavy tasks like the ones above. The second half I usually spend heads down on a hard piece of code.

If the right experts are around, code sprints are an excellent time to attack a new piece of code you don’t quite understand. Learning how a module works from the expert who wrote it is far faster than doing it alone at home.

And finally, having lunch and dinner and socializing usually provides the social space for unexpected topics to slip out and get discussed, whether they be uncomfortable issues like dealing with a difficult team member or just a crazy feature idea that turns out to be not so crazy at all when discussed with the group.

If you have a chance to participate in a code sprint on a project you contribute to, don’t pass it up!
