Dwins’s Weblog


Workaround: SBT/Maven “Inconsistent module descriptor”

Posted in Development,Open Source Software by dwins on October 15, 2011

Here’s an issue I’ve been dealing with in GeoScript.scala for a while: when fetching dependencies I get an error message:

update: sbt.ResolveException: unresolved dependency: xml-apis#xml-apis-xerces;2.7.1: java.text.ParseException: inconsistent module descriptor file found in 'http://download.osgeo.org/webdav/geotools/xml-apis/xml-apis-xerces/2.7.1/xml-apis-xerces-2.7.1.pom': bad module name: expected='xml-apis-xerces' found='xml-apis'; bad revision: expected='2.7.1' found='xerces-2.7.1';

Sure enough, when I go to http://download.osgeo.org/webdav/geotools/xml-apis/xml-apis-xerces/2.7.1/xml-apis-xerces-2.7.1.pom I can see that even though the URL suggests the artifact would be (group: xml-apis, artifact: xml-apis-xerces, version: 2.7.1), it is in fact listed as (group: xml-apis, artifact: xml-apis, version: xerces-2.7.1).  Apparently Maven doesn’t verify those details when fetching dependencies, but Ivy (by default) does.  According to this JIRA issue I found there is an option to disable it, but SBT doesn’t seem to expose it (I fiddled around a bit with an ivysettings.xml file and the ivyXML setting but to no avail.)

Finally I just added an entry to my libraryDependencies setting like this:

"xml-apis" % "xml-apis-xerces" % "2.7.1" from "http://download.osgeo.org/webdav/xml-apis/xml-apis-xerces/2.7.1/xml-apis-xerces-2.7.1.jar"

This doesn’t avoid the warning but does let me go ahead with my build.  Good enough I guess.

But recently I’ve been hearing some complaints from folks checking out my code (that’s right, potential contributors!) who’ve been confused by the error.  Since first encountering this issue I’ve been informed that xerces is actually not even needed, the API it implements is included in modern JVMs or something, so I tried just adding a dependency exclusion to my SBT build file.  Success! Not only was I able to run the ‘update’ command with no scary errors, but the test suite even runs.  Ticket closed.

Except, as it turns out, Ivy needs that metadata to handle the exclusion (or something, I didn’t dig in too deep.)  I already had it in a local cache, thanks to having run the build with that explicit URL, but when benqua from GitHub attempted a build from scratch he ran into the same issues again.  ARRGH.

Finally I ended up sticking with the exclusion, but now the GeoScript build includes a dummy subproject that has the xml-apis dependency with the explicit URL.  The intermodule dependencies are set up so that it always runs before the main “library” subproject, meaning that the xml-apis-xerces artifact, with correct metadata, is in the local cache before any dependency resolution involving GeoTools begins.  Not incredibly elegant, but it does work.

I’ll be in touch with the GeoTools guys to see about just fixing the metadata in the repository – since Maven appears to ignore it I don’t see this breaking any existing builds.


GeoServer CSS – Conversion

Posted in Cool Stuff,Development,Open Source Software by dwins on July 25, 2010

When I first posted about GeoServer CSS, I was planning to follow up more frequently, but over the past couple of weekends I’ve been distracted by implementing some new features (integration with the Scala variant of GeoScript, styled marks) as well as getting away from the keyboard a bit.  This past week, however, I’ve been working on some speed improvements and I was thinking it would be nice to blog about those. Unfortunately, I haven’t explained what the conversion process is doing in the first place, so any discussion of performance would be a bit premature.  I suppose that means I should start at the beginning.

Last time I blogged about this, I said a bit about the parser that takes in a CSS file and breaks it up into objects that the Scala code inside GeoServer CSS can manipulate more easily.  The conversion process I’m talking about today disassembles and reassembles those rules into style components from the GeoTools API, and then you have a style that you can pass to GeoTools’s style serializer or straight to the renderer or whatever you like. Or, more graphically (ASCII art, woo):

 [CSS File] -- parser --> [CSS Objects] -- translator --> [SLD Objects]

The parser isn’t incredibly fast, but the real bottleneck is the translator.  Part of the point of the CSS converter in the first place is that CSS uses a very different model than SLD for combining rules: SLD uses the so-called Painter’s model, where each rule is applied in its entirety to any feature that matches its filter, and if multiple style rules apply to the same feature then they are drawn on top of each other.  CSS on the other hand, allows rules to override properties from other, less specific rules. The way I came up with to deal with this impedance mismatch was to inspect the rules, and produce one SLD rule for each possible combination of CSS rules.  For example, starting with this CSS source:

[a > 1] {
   fill: blue;
} 
[b < 2] {
   stroke: yellow;
}

We produce 3 SLD rules – one for the case where [a > 1 AND NOT b < 2], one for [a > 1 AND b < 2], and one for [NOT a > 1 AND b < 2].  The fourth combination, with both rules negated, is a valid combination, but we throw it out since there are no styling attributes for that combination.  In general, there are (2^n – 1) potential SLD rules for a stylesheet containing n CSS rules.  That’s a lot; for only 10 input rules the output stylesheet could contain over 1000 rules.  I say “could” because there are some optimizations that can help to avoid enumerating all these possibilities… but as I said before, I’ll leave that for another post. After combining all these style rules, we still need to re-encode them as SLD.  Even this step is not straightforward as the CSS module unifies certain “contextual” filters with the SLD Filter concept, and also uses z-indexing instead of SLD’s “stack of featuretypes” model for controlling rendering order.
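To make the combination count concrete, here is a minimal Python sketch of the naive enumeration. It is not the code in the converter (which is written in Scala and applies the optimizations mentioned above); the `combine` function, the string filters, and the "later rule wins" override order are all assumptions made for illustration, standing in for real selectors and real specificity rules.

```python
from itertools import product


def combine(rules):
    """Yield (filter, properties) for every non-empty combination of rules.

    `rules` is a list of (selector, properties) pairs. As a simplifying
    assumption, a rule later in the list overrides earlier ones when both
    set the same property.
    """
    for mask in product([True, False], repeat=len(rules)):
        if not any(mask):
            # Every rule negated: nothing to draw, so skip this combination.
            continue
        terms = []
        props = {}
        for (selector, properties), applies in zip(rules, mask):
            terms.append(selector if applies else "NOT " + selector)
            if applies:
                props.update(properties)
        yield " AND ".join(terms), props


rules = [
    ("a > 1", {"fill": "blue"}),
    ("b < 2", {"stroke": "yellow"}),
]
combos = list(combine(rules))
for sld_filter, props in combos:
    print("[" + sld_filter + "]", props)
```

With the two rules from the example this yields the three combinations listed above; feeding it 10 hypothetical rules yields 1023, which is why the real translator cannot afford to enumerate blindly.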

So, when a CSS rule combination is applicable to multiple featuretypes, or is applicable in multiple zoom ranges, or has symbolizers at multiple z-indexes, the equivalent SLD style will have multiple Rules corresponding to that single combination of CSS Rules.  That means we have to inspect the selectors to figure out the right set of SLD rules to produce.  In pseudocode:

featuretypes = extract_featuretypes(rule.selectors)
symbolizers_per_z_index = split_by_zindex(rule.symbolizers)
scale_ranges = extract_scale_ranges(rule.selectors)
for (ft in featuretypes)
    for (z, symbolizers in symbolizers_per_z_index)
        ftstyle = create_featuretypestyle(ft)
        for (range in scale_ranges)
            ftstyle += create_sld_rule(range, rule.selectors, symbolizers)
        output += ftstyle

Then, on top of that, we need to keep the total number of ftstyles produced as low as possible since each one requires an extra buffer at render time.  The code in Translator.css2sld does something similar, but shuffling all the rule combinations into one SLD style as it goes. So, clearly a lot going on here.  Sometime soon I’ll talk a bit about some of the things I’ve done to keep the conversion time under control (and for power users, how to make your styles friendly to the converter).

A Code Smell – the Utilities Trait

Posted in Development by dwins on May 7, 2010

warning: if you’re not a programmer you should probably stop reading now (use the time you’d have spent here at MS Paint Adventures; you’ll be glad you did.)

Back in school I saw a design pattern used in some students’ projects (including my own) where someone would want to have some constants used in several places in their code.  In Python, you would handle this with a simple variable in your module:

SNOWBALLS_CHANCE = 6.8e-35
RABBITS_FOOT_INFLUENCE = 7.5

A C++ programmer would probably use the preprocessor to define some constants that get replaced with literals before the compiler even sees the code:

#define SNOWBALLS_CHANCE 6.8e-35
#define RABBITS_FOOT_INFLUENCE 7.5

These projects were in Java, however, and Java doesn’t have free-standing values, or even a preprocessor like C++ does; every single thing has to be part of some class.  Values that aren’t actually associated with class instances require the static modifier to stand apart from objects.  Code referencing such constants needs to qualify them with the name of the class they belong to, unless they subclass it:

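class LuckConstants {
    protected LuckConstants() {
        /* don't even THINK of instantiating this, guys
           (protected, not private, so PokerDealer can still subclass it) */
    }
    // double, since 6.8e-35 and 7.5 are double literals
    public static final double SNOWBALLS_CHANCE = 6.8e-35;
    public static final double RABBITS_FOOT_INFLUENCE = 7.5;
}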
class PokerPlayer {
    public static void main(String[] args) {
        System.out.println("Odds:" + LuckConstants.SNOWBALLS_CHANCE);
    }
}
class PokerDealer extends LuckConstants {
    public static void main(String[] args) {
        System.out.println("Odds (adjusted):" + RABBITS_FOOT_INFLUENCE);
    }
}

This is kind of a lot of code, and if I want to have PokerDealer inherit from some GameMaster parent class I am going to have to use the long form since Java doesn’t do multiple inheritance.  Fortunately, there is also the option of inheriting constants from an interface, which doesn’t add any restrictions to the other parent types of a class.  The IConstants interface was the most common variation I saw, and I’m not aware of any particular shortcomings with this approach.  Still, it bugged me a bit that this setup involved this kind of useless type.  What use is an interface with no methods?  So I was really glad to find out about import static, which lets Java code reference static members of classes without qualifying them and without adding extra cruft to the type hierarchy.  (I started doing Java programming just before this feature came out, and I wasn’t able to use it for a little while as Apple dragged their feet a bit bringing it to the JVM that they provide for Macs.)

Okay, fast-forward 5 or 6 years.

Imagine my chagrin when, a couple of months into my first serious Scala project, I noticed that I had written some Utility Traits that did much the same things as those constant container interfaces, but providing some methods instead of some constants.  In Java such a thing isn’t possible since interfaces can’t contain implementation, but Scala’s traits can.  (Otherwise they are analogous to interfaces, and even reduce to Java interfaces when they don’t include implementation.)  Scala definitely has a nice equivalent to static class members that would be totally applicable here, so why didn’t I use that? Lame.

So when you find a trait in Scala code that doesn’t have any abstract methods, and maintains no state, it’s definitely time to consider refactoring it to an object instead.  (If it does have state, but no abstract methods, maybe it should be a class instead of a trait.)  You can easily convert any client code to the new way.  Let’s say you have a Utilities trait for doing some housework:

trait Utilities {
    def unclogDrain(): Unit = {}
    def cleanGutters(): Unit = {}
    def retileRoof(): Unit = {}
}

class HiredHand extends Utilities {
    def doWhatYourePayedFor(): Unit = {
        unclogDrain()
        cleanGutters()
        retileRoof()
    }
}

You could make it a Utilities object instead with:

object Utilities { ... }
class HiredHand {
    import Utilities._  
    // now everything from Utilities is in local scope, woo!
    ...
}

Neat!  Now to go fix up that code.

No Crystal Ball

Posted in Development,Open Source Software by dwins on January 12, 2009

The OpenGeo team recently created a new, more formal group for JavaScript developers (aka the ‘jteam’).  Starting this week, I was supposed to be dividing my time 3:2 between GeoServer work and jteam tasks.

The manager is dealing with some personal obligations and that first week on the new schedule was pushed back a week.

Over the winter break a neat styling tool for GeoServer was announced that made use of a GeoServer extension I’ve been working on, on and off, for the past 9 months or so. Since it’s been getting a fair bit of attention from the community, I figured I’d be putting a lot of work into polishing it up so we could make it an official extension (basically, put a link to it on the downloads page).

I ended up fixing random bugs against GeoServer while another developer reviewed the module.

Of those bugs, this one sounded like it would be pretty straightforward to fix. Another sounded pretty tough.

The first took me two days to fix; the second one I resolved in an afternoon.

I am beginning to think that I am not very good at predicting the future.

Happy 2009

Posted in Development,Ideas by dwins on January 5, 2009

Hey, looks like another new year is upon us (I know I missed it by a few days, but give me a break as I’ve been on vacation for a couple of weeks and my brain is still kicking back into gear.) I don’t usually put too much stock in coming up with resolutions for the new year, but this time around I think I’ll make an exception.  My resolution: complain more, but only complain to the right people.

Recently at work I’ve noticed I’m developing a bad habit of, when I have a problem with the way things are being done, complaining to everyone except the person responsible, whether because I think it’s too minor an issue to debate or the culprit is not online/around when I run into trouble or I feel like decisions have been made over my head or whatever.  While out of the office the past couple of weeks I’ve been thinking that over, and I see two big problems with that approach:

  • complaining about things to others fosters a predisposition for them to find flaws with their own work, and establishes a precedent that makes following suit seem more acceptable
  • not complaining to those responsible means that things won’t get fixed.  Note here that ‘fixed’ might not mean changing what’s done, it could just be giving me that extra bit of perspective that helps me understand why things are being done that way.

These two things feel like a pretty lame combo for a team, so hopefully phasing them out will be a big win.

As long as I’m doing the resolution thing, I think I will also try to post more regularly on this blog.

the right audience

Posted in Development,Ideas by dwins on November 13, 2008

I just woke up from a… wow, 2-hour-long nap.  I wasn’t planning to take a nap; I was trying to read a book about JavaScript, a programming language which interests me only due to the fact that I might eventually have some use for it.  (Not that that’s a bad reason to learn a language; there’s only one language out there that I’ve learned for a reason other than that I thought I might need to get stuff done in it; and I even take a certain glee in the raw mindless-effort-reduction capacity of some languages, a la bash:

15:56 < bmmpxf> dwins: iwilling and I are still having trouble getting
 the data in the postgis.  please stand by.
15:57 < dwins> bmmpxf: what sort of trouble? should I lend a hand?
15:59 < bmmpxf> dwins: Just trying to avoid typing in the password lots of times
16:00 < dwins> bmmpxf: parens to the rescue
16:00 < dwins> (for file in *.shp; do shp2pgsql $file; done) | psql
16:01 < dwins> man I love punctuation

Still, JavaScript the language doesn’t interest me half so much as JavaScript the platform for web application development.)

So what happened?  Was I in the wrong frame of mind (is there some sort of reading flow I should get into?)  Is my brain too feeble to handle more than 37 pages of JavaScript’s splendor without needing to recharge?  Is a book a bad way for a programmer to become acquainted with a language?  Did I miss my daily dose of wake-up pills this morning?

I think it’s just that I picked a book that was written for non-developers (from the preface: “We’re geeky, so you don’t have to be!”) and so dumbed things down a bit much.  I didn’t immediately put it down because, hey, I want to be able to explain things to non-developers too!  But after spending forty pages with half-a-dozen sidebar notes saying ‘sorry we included Hello World as an example in a programming text’ and ‘html is a thing you can write in any editor, but don’t use Word!’ I was a little overwhelmed.  Why not just skim, you ask?  This particular book has so few words per page that I found skimming pretty frustrating, it just didn’t work for me.

So, moral of the story: if you read something that’s written for a target audience that clearly doesn’t include you, you’re probably going to feel like you’re going against the flow.  Similarly, if you’re writing a thing (as the OpenGeo team is right now with a serious reworking of the GeoServer documentation) you’re throwing away a lot of your effort if you don’t have the right audience in mind.

Java(Script)?

Posted in Bio,Development by dwins on October 20, 2008

Recently my manager at OpenGeo, Chris Holmes, asked me about working on some JavaScript projects.  So far at OpenGeo I’ve been using Java pretty exclusively (I’ve spent a bit of time patching up a couple of things in Python), so JavaScript would be a pretty serious change from my normal routine.  Java is a compiled, statically typed, strongly typed language with one runtime that dominates the market (or at least, where we can specify that one particular runtime is supported by the project); JavaScript is interpreted, dynamically typed, weakly typed, and basically has as many interpreters as there are browsers out there, all of which have their own deviations from the standard.

Initially, I told Chris I’d be up for a switch of language, but when I asked some of the guys who are already doing JavaScript they told me it’d be smart to stay away if I could, because it’s just such a pain to deal with cross-browser development in JavaScript.  I’m not really that concerned about it though; I mean, don’t all languages have their weak points?  I can’t just avoid everything that doesn’t inspire outright fanboyism from its users. (Though, to be fair, there’s plenty of kool-aid drinking in the Java world as well; I just don’t have any handy examples in the form of webcomics.)

Anyway, I don’t have any JavaScript projects lined up just yet (and I do have a fair bit of stuff to do in Java), so I guess I’ll just see how things go.  Thoughts from random folks on the interwebs welcome.

(Aside for non-techies: ‘typing’ is the model by which a programming language structures data.  A strongly typed language forces you to use each value in your code as a single type (i.e., is this a number or a word or a record representing a country, etc.), while a weakly typed one will try to guess the types based on context (so you can, for example, add the characters “23” to the number 8 and get 31).  A statically typed system does all this checking before the code ever runs (and usually refuses to run if any of the checks fail) while a dynamically typed system waits until a line is run to check (and will run just fine with code that makes no sense as long as you don’t actually get to that line of code).  Wikipedia probably explains it better.  Anyway, typing is a pretty big factor in a language’s ease-of-use, since it affects how much work the compiler/interpreter can do for you in terms of validating code or figuring things out from context.)
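For the curious, here is a small Python sketch of two of those ideas. Python is not one of the two languages being compared in this post; it is used here only because it happens to be strongly typed and dynamically typed at once, so it shows both behaviours in a few lines. The function names are made up for the example.

```python
def add_mixed():
    # Strong typing: Python will not guess whether "23" is text or a number,
    # so this raises a TypeError instead of producing 31 (or "238").
    return "23" + 8


def describe(value, run_nonsense=False):
    if run_nonsense:
        # Dynamic typing: nothing checks this line until execution reaches it.
        # Integers have no such method, so it fails only when actually run.
        return value.no_such_method()
    return "value is " + str(value)


# The nonsense branch is never reached, so this call works fine.
print(describe(8))

try:
    add_mixed()
except TypeError as err:
    print("TypeError:", err)

# Saying explicitly which type you mean makes the addition legal.
print(int("23") + 8)
```

A statically typed compiler would reject the nonsense line in `describe` before the program ever started; here it sits harmlessly in the file until someone passes `run_nonsense=True`.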

My Very Own Branch

Posted in Development by dwins on October 16, 2008

I’m trying something new this week: working on a slew of new features for GeoServer in a branch of my own.  It’s kind of nice to have my own sandbox, although when I finish with the new stuff I’ll be making a pretty substantial patch to the main GeoServer branch, meaning it will probably require a vote at the weekly GeoServer meeting.  Still, I’ll be able to keep my stuff separated and versioned while waiting on the 1.7.0 release, and I think the new features will be pretty exciting at the end.

What am I working on? A few things are on my plate right now:

  • Make human-attended configuration for regionating a performance tweak rather than a necessity for regionating to work at all
  • Experiment with some alternative ways of expressing the tree (basically, fake the inclusion of all features by drawing the first few in vector form, and including a raster background with the rest)
  • Allow users to set custom templates for the KML popups on a higher level than individual layers.

<plug type=”shameless”>

If you want to help me out, you can check out the work in progress and let me know if you see anything broken in Google Earth (aesthetic opinions welcome as well!)  Just visit http://publicus.opengeo.org/dwins_kml/mapPreview.do and click on any of the KML links, then browse around.  (You’ll need Google Earth, of course; you can grab the installer from http://earth.google.com/) There will probably be another update here with more info about checking the different alternatives once I get the different visualization modes working, so stay tuned.

</plug>

Code Review Review

Posted in Development,Ideas by dwins on October 16, 2008

Mel Chua writes about code review tools in a recent blog post, pondering whether a software code review tool could benefit OLPC (where she’s now employed, doing QA and, knowing her, a zillion other things).  I was about to just make a comment there, but I realized I have a fair bit to say, so full-fledged blog post from me it is.  The only potential benefit she directly mentions is that

there’s this constant loop of feedback and revision happening with code – imagine something going around and around in a positively improving cycle – and when it’s ripe and ready, someone with privs can pluck it from that cycle and make it a commit (as opposed to a linear “go forward… get stuck here… if it doesn’t work discard it out of the waterfall completely” system).

She also links to a video of a Google Tech Talk by Guido van Rossum (of Python fame) about Google’s code review process, along with a tool he wrote to make it easier. (The open-source version can be found here.)  There, he enumerates several benefits of code review:

  1. You can catch bugs before they make it into revision control
  2. Senior developers can impart knowledge to n00bs
  3. Senior developers can verify that said n00bs can be trusted with commit privileges (this is a big one in the open-source world)
  4. Team members can familiarize themselves with each other’s strengths and weaknesses
  5. In general, you get the benefits of pair programming without the scheduling constraints.

A while back we installed a trial version of Atlassian’s Crucible code review tool at OpenGeo to use for GeoServer development.  I personally ended up using it for only two reviews: a (vast) speed improvement on the code that builds regionated hierarchies in GeoServer, and to review a patch submitted by Wayne Fang of Refractions Research.  Maybe this means that constant code review is not well suited to use by smallish groups (I interact with about 4 developers who actually work on GeoServer on a regular basis, and most of the time we work pretty independently on separate parts of the project.  We are geographically dispersed, so often Important Stuff happens while I am asleep.)  Anyway, I’d say that of the benefits Guido mentioned, only the first couple really came into play.  Interestingly enough, the KML regionating work ended up being a more-or-less complete rewrite, including a new algorithm, so familiarity with the code wasn’t that important in the end.  Much of the discussion around the patch from Wayne was related to style and design concerns (does that class name really signify what it’s actually doing?) rather than behavior, so may have actually been better served by the mailing list.

A couple of other random thoughts:

  • Code review is a process.  Including more process means more training needed by people coming onto a project, though this can be mitigated by restricting the complicated bits to senior members, or, ideally, an automated system.  I don’t think it’s a foregone conclusion that adding code review to a project’s policy will make it a better project; it should be considered against the weight in developer ‘activation energy.’  For something like GeoServer, where development is fairly process-heavy anyway, this is pretty marginal (and I’d be interested in seeing where a more serious trial of code review could get us.)  For something with fewer dependencies, less API to get acquainted with, a quicker test suite, etc, it might be overly burdensome.
  • Guido also mentioned that code review will happen whether you set aside time for it or not since any bugs that make it into the codebase will have to be fixed.  It may be a better strategy to just fix the bugs that are caught in QA, directing developer attention to the most egregious bugs first.  Catching bugs by inspection is not exactly bulletproof.  Of course, if you’re relying on code review as a means of maintaining some minimal level of API design quality, going back after the fact is a bit tougher.  But then, you can review API design without going through a line-by-line audit of the code too.
  • As I mentioned earlier, I work on a pretty small team.  Patches to GeoServer from developers who don’t already have commit privileges are fairly rare.  I might appreciate code review tools more if I had more occasion to assess whether or not a particular change was up to par for the project I’m working on.

Types In Stereo

Posted in Development,Ideas by dwins on July 27, 2008

There seem to be a lot of (sometimes conflicting) stereotypes about software developers.  They don’t like to talk to people, instead preferring to hide away someplace with no windows, lit only by computer displays.  They don’t care about their appearance and seldom shower.  When they do talk, computers are the only thing they know anything about.  They have atrocious senses of humor.  They like to create listings of how things are defined, especially themselves and the things they work with.

Personally, I try to shy away from this sort of thinking.  I mean, the logical extreme is to become that guy posting on Slashdot about having Asperger syndrome in a bragging tone, which is, well, not that attractive a prospect.  (Note to the bored: I didn’t dig too hard trying to find some example comments, but I’m sure they’re not too hard to find.  http://ask.slashdot.org/article.pl?sid=03/06/20/1237229&mode=nested&tid=134 should be a good start.)

However, there is a grain of truth in a lot of these sentiments about the techies. I mean, I’m not too antisocial, and I hated working in a windowless room a couple of years ago when that was what I was doing, but I do kind of hate thinking about my appearance (I shower regularly though!!) Especially though, it’s pretty understandable that software guys talk a lot about software.  It’s not only what we work on, it’s what we work in, since for most software projects software is used to

  • manage source code
  • generate source code
  • communicate about source code
  • compile the source code to the finished product
  • test the source code

and so on.

The thing is, a lot of the stuff that we work with requires a good deal of familiarity with a wide range of technical concepts.  It does take someone who really loves to work on code to be good at working on code; and it does take a special kind of person to love working on code.  Because honestly, tweaking a few lines of a text file, then hitting [Run] and hoping everything works so you can tweak some more lines that will probably break everything is a pretty painful experience.  It’s one of those things that I love to have done and don’t especially love doing.  So I (and I presume most software guys) end up spending a lot of time thinking about techniques, practices, tools, policies, and other ways of improving that experience.  It doesn’t always leave time for other stuff.

Disclaimer: The Open Planning Project is full of developers who write novels and raise children and cook experimental dishes in their spare time.  I’m not saying it can’t happen, just that it’s no big surprise when it doesn’t.
