Groovy folks, time to start agreeing

February 27th, 2015

I wrote about the drama unfolding in the Groovy project a month ago.

I left that topic for a while, but I was pleased to find out today that the question is no longer whether they need to move to a foundation, but rather which foundation it should be. There’s an email thread that has 188 messages and counting, going for more than 2 weeks, where the community is trying to figure out where to go.

I feel a bit of deja-vu, and I feel like I know exactly what’s going on. And no, I don’t mean the Jenkins drama, though we were in a relatable situation. Instead, I was thinking of when my wife and I bought a house 5 years ago.

So there were two houses we really liked. The white house and the beige house. Like any informed buyers would do, my wife and I started dissecting the pros and cons. The white house felt a lot brighter and airier than the beige house, but the beige house had a bigger backyard. The white house was closer to a hospital, which might mean more noise. The beige house had a big tree nearby, and the gutter might fill up. The list went on and on like that.

Now, the thing is, at a certain point, a list like this gets more confusing than helpful. I can add up all the pros and cons, but it doesn’t help me get any closer to making the decision. In fact I’m no longer sure if the white house I wanted was really such a good idea. After all, it has no less than a dozen things listed under “cons”. I can see that my wife is getting just as confused as I am. The conversation starts to go in circles. I was lucky enough that our parents were living on the other side of the Pacific Ocean, so we kept this conversation to ourselves. Otherwise, I’m sure it would have been even worse. They mean well, but sometimes too many opinions are more harmful than helpful.

At 1am on one of those long nights, I finally realized that agreeing on something, anything, is more important than figuring out the absolute best house out of two. So like every husband would do, I started trying to talk myself out of the white house I originally wanted, and speak about things I liked about the beige house that my wife originally wanted. I tried to downplay the concerns she had about the beige house.

Even though I started doing this consciously, the strange thing is that I started getting convinced by my own arguments that I didn’t fully believe in. Sure, the beige house doesn’t have the attic room that’s going to be my LEGO room, but I can get a brighter office and maybe I should keep my LEGO there so that I can occupy myself if meetings get boring. And in the end, we bought the beige house and we still live there mostly happily.

I feel like it’s time for Groovy guys to start doing this “let’s agree, whatever it is” dance. Judging from the conversations, I think they’ve figured out that they can live in any of these three foundations. The trick is not to get caught up on all the gory details, because there will always be something you don’t like. Yes, voting might be burdensome. Yes, losing the @author tag might be annoying. Yes, infra migration would be painful. But you’ll be all right, and you’ll get used to it sooner than you think.

It’s time for people in the community to give the project leaders a blank check. I think we should be able to all trust them that they have the best interest of the project in mind.

And more importantly, it’s time for the project leaders to start converging. You guys need to sense where your consensus is heading to, and try to talk yourself into it. Try to create an echo chamber.

Engineers aren’t the best people to do this, but you guys really need to do this, because the clock is ticking. The difference between foundations is relatively small, but the difference between moving forward and procrastinating is huge. It’s a part of the leadership responsibility to form a consensus and then turn around and sell that to everyone, so that everyone feels better about what’s being done.

Remember, there’s really no wrong answer. Just different correct answers.

misc

POTD: ExceptionInInitializerError logger

January 28th, 2015

It’s been a while since I’ve done a project of the day, but here it is, the fruit of my yak-shaving today.

The problem I was trying to solve today was java.lang.NoClassDefFoundError: Could not initialize class Xyz. When a Java class fails to initialize, the first attempt to use it causes ExceptionInInitializerError, but subsequent attempts to use that class result in this rather unhelpful java.lang.NoClassDefFoundError: Could not initialize class Xyz without the chained exception.
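
To make the symptom concrete, here is a minimal sketch that reproduces it. The class names InitFailureDemo and Xyz are made up for illustration. Note that newer JDKs may attach a cause to the second error, so the sketch only reports the exception type.

```java
public class InitFailureDemo {
    // Hypothetical class whose static initializer always fails.
    static class Xyz {
        static final int VALUE = compute();

        static int compute() {
            throw new IllegalStateException("boom");
        }
    }

    // Touches Xyz and reports which kind of error came back.
    static String attempt() {
        try {
            return String.valueOf(Xyz.VALUE);
        } catch (LinkageError e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // The first use triggers class initialization and carries the real cause.
        System.out.println(attempt());
        // Later uses only say that the class could not be initialized.
        System.out.println(attempt());
    }
}
```

The first call reports ExceptionInInitializerError and every later call reports NoClassDefFoundError, which is why missing that first report hurts so much.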

This problem has been reported to Java for years, but the Java SE people probably don’t understand how painful this is in a large modular system, where the initial exception can be reported in so many places: in one of the 1000 builds you’ve done today, in an HTTP response to somebody, on stderr, in a log, or swallowed by an empty catch block.

So I wrote a little Java agent that uses java.util.logging to log every ExceptionInInitializerError at the point of instantiation. In this way, even on a server, you have one place you can go to check for all errors of this kind. Through j.u.l, you can write a custom Handler to report errors elsewhere, if you want to.
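
As a rough idea of what such a custom Handler could look like, here is a sketch that collects every logged ExceptionInInitializerError in memory. The logger name "example.init-errors" and the class InitErrorCollector are assumptions for illustration, not what the agent actually uses, so check the agent’s source for the real logger name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class InitErrorCollector extends Handler {
    private final List<Throwable> errors = new ArrayList<>();

    @Override
    public synchronized void publish(LogRecord record) {
        // Keep only the records that carry an ExceptionInInitializerError.
        Throwable t = record.getThrown();
        if (t instanceof ExceptionInInitializerError) {
            errors.add(t);
        }
    }

    @Override
    public void flush() {
    }

    @Override
    public void close() {
    }

    public synchronized List<Throwable> errors() {
        return new ArrayList<>(errors);
    }

    // Simulates what the agent would log and returns how many errors were captured.
    static int demo() {
        Logger logger = Logger.getLogger("example.init-errors"); // hypothetical logger name
        logger.setUseParentHandlers(false);
        InitErrorCollector collector = new InitErrorCollector();
        logger.addHandler(collector);
        logger.log(Level.WARNING, "Failed to initialize",
                new ExceptionInInitializerError(new IllegalStateException("boom")));
        logger.log(Level.WARNING, "Unrelated problem", new RuntimeException("ignored"));
        logger.removeHandler(collector);
        return collector.errors().size();
    }
}
```

Instead of keeping the errors in a list, publish() could just as well send them to e-mail, a ticket system, or wherever your team actually looks.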

The number of people who will find this tool useful will probably be small, but I hope they’ll really appreciate this little gem. May Google let them find this page.

potd

Groovy project should have a clear governance structure

January 20th, 2015

I just came back from Tokyo to learn that the Groovy project is looking for a new home. Related posts from the project leaders here, here, and here. Hacker News commentary is here.

This news hit close to home for me for several reasons. For one, I like Groovy a lot myself, to the point that I have developed several projects around it (like this and this.) Two, the Jenkins project uses Groovy a lot in various places. I think Groovy has a number of things going for it, such as great IDE support (and the optional static typing boosts this), greater compatibility with Java, and an existing ecosystem. So my best wishes to them.

One of the things I realized while reading the news is that I actually don’t know the governance structure of the project. Is the name “Groovy” trademarked? If so, who owns it? How about the domain name? How is the decision making done? Who becomes a committer?

Questions like this are important at times like this, because we don’t know how much is owned by Pivotal, and that determines how much is up in the air.

For example, had Groovy been an Apache project, then we’d know the answers to all of the questions above, and this news really just boils down to who would be willing to hire the team and let them work on the project. Maybe they won’t find a single company that is willing to hire everyone and put them all full time on open-source Groovy. I bet they talked to prospects in private before going public, so at this point I assume the chance of this happening is pretty low. But I have no doubt they will each find gainful employment that does involve some open-source Groovy work.

In contrast, if Groovy is a company-sponsored open-source project like Spring was, in that the company owns all the key assets and dictates the development process, then we have a much bigger problem at hand, because there’s greater uncertainty. The project would have a bigger risk of fragmentation (for example if the team gets hired by different competing companies.) Perhaps the license would change.

This is why I think the ownership of the project should be thought of separately from the ownership of the developers (i.e. who pays the salary of the key contributors of the project.) When the latter changes, having the former sorted out considerably reduces the impact. And this comes from my own personal experience dealing with Hudson/Jenkins. This is one of the reasons why the Jenkins project has a governance structure laid out.

So I’d like to encourage the Groovy project to sit down and clarify the governance. This should be a part of their “find a new home” plan. Maybe they could even just join the Apache Software Foundation since the license is compatible, or maybe they could come over to SPI, where Jenkins is.

At this point, I’d be very happy with a less active Groovy project without a corporate sponsorship. But I wouldn’t like to see a governance-less Groovy project. I think they can avoid that.

Update: I’ve written a follow-up post on this topic.

misc

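Heading to Japan this weekend

January 6th, 2015

I’ll be in Japan from the latter half of this week through the end of next week.

The main event is Jenkins User Conference Tokyo on 1/11. There is still room to attend, so please come. Please join the networking party as well.

On Monday, I’m planning to go sightseeing in Tokyo with the guest speaker who is coming to JUC from BMW Car IT in Germany. If you are part of the Jenkins crowd and can come hang out with us, please let me know.

On Tuesday night, right after the holiday, there is the JJUG Jenkins festival! If you can’t make it on Sunday, please come to this one.

For the rest of that week I’ll be visiting companies all over the place, running around to bring CloudBees to Japan. If you are interested in that kind of conversation, please e-mail me personally.

The weekend of 1/17 and 1/18 is off, so I’d like to enjoy Japan for the first time in a while. Maybe I’ll go skiing, either as a day trip or overnight. If anyone is up for hanging out, please give me a shout.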

jenkins

POTD: random but meaningful name generator

April 6th, 2014

I’m working on the automated blackbox acceptance tests for Jenkins, where I often need to generate random unique names. The code has been using a random number generator to generate such names, but as I was debugging test failures, it became painful to remember those random names.

For example, a test might create two new Jenkins jobs “random_name_155230” and “random_name_137204”. Now which one was supposed to be the upstream and which one the downstream? Aside from a few exceptions, humans are generally not good at remembering those random numbers.

So I thought it’d be a lot better if these names were more meaningful, like “constructive_carrot” or “flexible_designer”. That is, if I have a decent sized corpus of N English adjectives and M nouns, I can generate NxM unique names (and get a few chuckles from whoever sees the generated names.)

After a bit of googling, I came across WordNet, and I took a subset of its corpus to come up with a small library that generates human-friendly random names. It has about 600 adjectives and 2400 nouns, resulting in roughly 1.5 million unique names before the generator wraps around.
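
To show the NxM idea in isolation, here is a toy sketch with a tiny hard-coded word list. The class FriendlyNameSketch, its word lists, and the stepping scheme are all made up for illustration and are not how the library itself is implemented.

```java
public class FriendlyNameSketch {
    // Tiny stand-in corpus; the real library uses a much larger word list.
    private static final String[] ADJECTIVES = {"constructive", "flexible", "quiet"};
    private static final String[] NOUNS = {"carrot", "designer", "lantern", "river"};

    // A step that shares no factor with N*M (3*4 = 12) visits every
    // combination exactly once before the sequence wraps around.
    private static final int STEP = 5;

    private int position;

    public FriendlyNameSketch(int seed) {
        position = Math.floorMod(seed, total());
    }

    public static int total() {
        return ADJECTIVES.length * NOUNS.length;
    }

    public String next() {
        // Split the position into an adjective index and a noun index.
        String name = ADJECTIVES[position % ADJECTIVES.length]
                + "_" + NOUNS[position / ADJECTIVES.length];
        position = (position + STEP) % total();
        return name;
    }

    public static void main(String[] args) {
        FriendlyNameSketch names = new FriendlyNameSketch(0);
        for (int i = 0; i < total(); i++) {
            System.out.println(names.next());
        }
    }
}
```

With 3 adjectives and 4 nouns this yields 12 distinct names and then repeats from the start, which is the same wrap-around behaviour described above, just at toy scale.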

You’d use this library like this:

RandomNameGenerator rnd = new RandomNameGenerator(0);

while (true)
    System.out.println(rnd.next());

The code is under the BSD license with no advertisement clause, and the library is on Maven Central. Hope you find it useful.

potd

POTD: Application configuration via Guice binding + Groovy

March 1st, 2014

Often I write my applications with Guice. I also often want to make those applications configurable externally. For example I might inject username and password for that app to talk to another app, I might configure some timeout value, and so on. I make these configuration values available in Guice, so that I can access them wherever I need them. All of this is pretty common in many other places, I’d imagine.

Given that all I’m doing here is to pass configuration values from left to right, I thought it’d be nice if I could write the configuration directly as a Guice module by using the Guice binder EDSL. Then I wouldn’t have to parse and translate this configuration any more.

And that became my project of the day.

This little library allows you to write Guice binding definitions in a text file:

timeout = 3
bind Payment named "customer" to VisaPayment

From your program, you use GroovyWiringModule to load this configuration file:

Module config = new GroovyWiringModule(new File("/etc/myapp.conf"));
Injector i = Guice.createInjector(
  Modules.override( ... my application's modules ...)
    .with(config));

The end result is that the above script gets translated into the following binding:

bind(int.class).annotatedWith(Names.named("timeout")).toInstance(3)
bind(Payment.class).annotatedWith(Names.named("customer")).to(VisaPayment.class)

Using Groovy as the host language for the DSL has other benefits. If you are using system properties or environment variables to configure something, you are basically stuck with strings as the only representation of the configuration. With Groovy, I can create a relatively complex object and bind it, or even put in some logic to further obtain values from elsewhere:

bind Payment toInstance new VisaPayment(
  cardNumber: "1234-5678-9012-3456",
  expiration: new Date(System.currentTimeMillis()+TimeUnit.DAYS.toMillis(30)),
  cvv: new URL("http://secret.server/cvv").text)

With the functionality in Guice to override definitions in one module by another, I can even override bindings defined in programs, for example to get more logging, add a filter, etc.

potd

POTD: cucumber annotation indexer

February 27th, 2014

Cucumber for Java requires that you specify the packages in which your step definitions exist. At runtime, cucumber uses some hack to try to list all the classes in this package (it’s a hack because class loaders never really support the listing operation), loads them one by one, and finds those that have step definition annotations like @When and @Then. This is both poor user experience (can’t you just find my step definitions!?) and poor performance (loading all the classes under a package is expensive.)

So I wrote a library that offers a much better alternative. It uses the annotation indexer to create an index of step definitions and hooks at compile time. Thanks to JSR-269, this happens automatically on Java 6 and later. With the index in /META-INF/annotations, the runtime can load all the step definitions quite efficiently.
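
To illustrate why an index is so much cheaper than scanning, here is a sketch of the runtime half. It assumes an index that simply lists one class name per line, which may not match the actual file format of the annotation indexer, and the @When annotation and LoginSteps class are stand-ins defined here, not Cucumber’s own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class StepIndexSketch {
    // Stand-in for a step definition annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface When {
        String value();
    }

    // Stand-in for a user's step definition class.
    public static class LoginSteps {
        @When("I log in")
        public void logIn() {
        }

        public void helper() {
        }
    }

    // Loads only the classes named in the index, instead of every class in a package.
    static List<Method> loadSteps(BufferedReader index, ClassLoader loader)
            throws IOException, ClassNotFoundException {
        List<Method> steps = new ArrayList<>();
        String line;
        while ((line = index.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            Class<?> type = Class.forName(line, false, loader);
            for (Method m : type.getDeclaredMethods()) {
                if (m.isAnnotationPresent(When.class)) {
                    steps.add(m);
                }
            }
        }
        return steps;
    }

    // Feeds an in-memory index to loadSteps and returns the step method names found.
    static List<String> demo() {
        String index = LoginSteps.class.getName() + "\n";
        List<String> names = new ArrayList<>();
        try {
            ClassLoader loader = StepIndexSketch.class.getClassLoader();
            for (Method m : loadSteps(new BufferedReader(new StringReader(index)), loader)) {
                names.add(m.getName());
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

The point is that the set of classes to look at is known up front, so nothing has to list directories or jar entries at runtime.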

The library contains a Backend implementation, so you should be able to just add it to your project dependency, and cucumber should automatically find this (and thus all your step definitions and hooks.)

By the way, this horrible technique of scanning jar files, listing class files, and finding annotations from there is unfortunately commonly seen in many other libraries. This was a necessary evil in the days of Java 5, but it should really die in this day and age. If you rely on classpath scanning, please switch to the annotation indexer, which provides the backbone functionality of this POTD.

potd

POTD: no more tears

December 14th, 2013

In a modular Java program or in a large Java project that has lots of dependencies, you often end up with a version of a library that’s different from the version used to compile the code.

This often results in LinkageError, where a method/field that was present when the code was compiled does not exist any more in the version being loaded at runtime. This restriction applies even to seemingly trivial, safe changes, such as changing the return type of a method to a subtype of what it used to be.
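
The reason a narrower return type breaks callers is that the JVM links a call by the full method descriptor, and the descriptor includes the return type. Here is a small sketch that prints the descriptors for a hypothetical no-argument method before and after such a change. The class name DescriptorDemo is made up for illustration.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    // Builds the JVM descriptor string for a method with the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Descriptor that old callers were compiled against.
        System.out.println(descriptor(CharSequence.class)); // ()Ljava/lang/CharSequence;
        // Descriptor after narrowing the return type to String.
        System.out.println(descriptor(String.class));       // ()Ljava/lang/String;
    }
}
```

Since the two strings differ, a caller compiled against the first one cannot find the method once only the second one exists.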

Previously, the only way to deal with this was not to remove any signatures that matter. In other words, you count on library/module developers to be more disciplined. Over time, Java programmers have accepted this as a way of life, but there are some notorious offenders (Guava and ASM, I’m looking at you.) Besides, it makes it difficult to evolve code.

I have dealt with this in multiple different ways in the past.

The bridge method injector is an example of static compile-time transformation. This kind of tool is non-intrusive to the users of a module, which is good, but it’s still a tool for library developers to be diligent, and processing at compile time means it has only limited information to operate on.

The bytecode compatibility transformer is an example of runtime transformation. This has a lot more information to let it do the right transformation, but modifying class files on the fly requires a custom classloader, which limits its applicability.

On the way back from my recent trip, I realized there is a third way to achieve the same effect: invokedynamic. You see, invokedynamic is really just a mechanism for deferring the linking to runtime. This allows me to combine the benefits of the two approaches. I can transform class files at compile time without really deciding how the references are linked. Then at runtime, I can decide how they actually get linked, but without the need for runtime transformation. The only downside is that it requires Java 7.
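
Here is a rough sketch of the runtime half of that idea, using only java.lang.invoke. The helper link() plays the role a bootstrap method would play: it looks the target up by name and parameter types, ignores the declared return type, and adapts the result to what the call site was compiled against. The class LateLinkSketch and its methods are hypothetical and this is not the code of the actual project, which also has to rewrite the call sites into invokedynamic instructions at compile time.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

public class LateLinkSketch {
    // Resolves an instance method at runtime; the first parameter of
    // callSiteType is the receiver, the rest are the method arguments.
    static CallSite link(MethodHandles.Lookup lookup, String name, MethodType callSiteType)
            throws ReflectiveOperationException {
        Class<?> owner = callSiteType.parameterType(0);
        Class<?>[] arguments = callSiteType.dropParameterTypes(0, 1).parameterArray();
        // Reflection matches on name and parameters only, so a changed
        // return type in the loaded library version does not matter here.
        Method target = owner.getMethod(name, arguments);
        MethodHandle handle = lookup.unreflect(target);
        // Adapt whatever was found to the signature the caller expects.
        return new ConstantCallSite(handle.asType(callSiteType));
    }

    static String demo() {
        try {
            // Pretend the caller was compiled against "Object append(String)",
            // while the loaded class declares "StringBuilder append(String)".
            MethodType compiledAgainst =
                    MethodType.methodType(Object.class, StringBuilder.class, String.class);
            CallSite site = link(MethodHandles.lookup(), "append", compiledAgainst);
            Object result = site.dynamicInvoker().invoke(new StringBuilder("no more "), "tears");
            return result.toString();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A direct call with the mismatched descriptor would have failed to link, while the deferred lookup gets to decide at runtime what “the same method” means.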

But in any case, I thought the idea was clever, so I implemented it as my “project of the day”. Please let me know what you think.

potd

Why I hate Zendesk

August 30th, 2013

In the Jenkins project, I use JIRA. And for CloudBees customers, we use Zendesk. And I positively and passionately hate Zendesk. It’s quite painful for my usage, and even more disappointing as many of the things that make me suffer can be fixed so easily. So I decided to write this to make myself feel better.

The way we use Zendesk is really not that different from the way the Jenkins JIRA is used, which probably applies to just about any software project. People report strange behaviours in software, stack traces, log files, etc. We look at those, sometimes ask for more information, ask them to try a new binary, get new data, and repeat this process until it runs its course. I don’t think we are that unique.

This kind of support work is a lot like detective work. You look at various log files and stack traces, you connect dots, and you derive a certain hypothesis. Sometimes additional facts are discovered later, or a hypothesis is proven wrong, and you start digging in another direction.

When I’m doing this, it’s very important that I “cite” my sources correctly. I’d like to be able to say “the stack trace in comment #7 indicates that X is doing Y when Z. This is consistent with ticket #150 comment #9.” But I can’t do this in Zendesk, because they do not have any sequence number or ID for comments. So the best I can say is “comment made by John 11:59 on Apr 13th 2013 in ticket #150”, and if somebody (like me 6 months from now) wants to see the referenced comment, he has to manually find that comment. This is like writing a research paper without a bibliography section, or how sloppy program managers refer to tickets (“what happened to that ticket, you know, the stability issue from Acme Corp a few weeks ago?”). This is particularly maddening because obviously they have a unique ID in their database, and I can even see it in the HTML source. It’s really only minor additional work to make comments referenceable. And yes, JIRA has had permalinks for comments for a long time.

There are a lot of useful features to be built in this direction. For example, I’d love to see log files visible online, in such a way that each line is separately addressable by URL, and to allow agents to leave comments, much like how you review code. In this way, I can leave the audit trail for my detective work so that others can retrace it later.

Another thing that bugs me is the inability to edit comments that have already been posted. Zendesk supports markdown, which is good; it helps you organize information a lot more than just plain text. But some customers don’t know that, and file a ticket in plain text. Similarly, when the ticket comes from the e-mail gateway, line wrapping and the like often mess up the format.

In JIRA I often edit the ticket description to fix this up — convert a section of the text to a fixed-width font, remove line wraps, delete the pointless e-mail footers, etc. Have you ever had someone who posted 1500 lines of log records as a comment? Move that off to an attachment, and edit that out! In other situations, I want to go back to my earlier comment and strike out something, for example when my earlier hypothesis turns out to be wrong. But I can’t do that either in Zendesk. (And Zendesk doesn’t let you collapse a comment, either.)

I can go on forever, but I should be working, not whining, so I’ll finish this up with just one more problem: the inability to edit a closed ticket.

Since CloudBees uses Zendesk for supporting paid customers, developers are expected to bring tickets to completion, to the “closed” state. I’m pretty sure we aren’t unique. The problem is that once a ticket is closed, one cannot make any further updates.

Say another ticket was filed later by somebody else that is seemingly related? Can’t add a comment to link to it. If detective work led to an old “unresolved” case and you came up with a new hypothesis that might explain a behaviour? Can’t put it there. Adding a new tag? Nope.

My colleague Ben Walding made an observation that Zendesk probably has a different kind of audience in mind: one where relatively unsophisticated “level 1” support people sitting in an office somewhere in Asia go through 100s of tickets every hour as fast as possible. In other words, Comcast or AT&T kind of support. Prominence given to features like a templated response backs up this assumption. I can only assume Zendesk works well for those use cases.

But if you are considering Zendesk and your use case is more like ours, the sophisticated software support, then I hope you learn from our mistake and stay away from Zendesk.

misc

LEGO Earth project

August 13th, 2013

I play LEGO a lot with my daughter, and I’ve been obsessed with making spheres. My last sphere building attempt was building a mini LEGO Earth in 2009. While I enjoyed the project, it was also clear to me that there was room for improvement.

The main issue is that the “pixel resolution” of LEGO is so low that many of the distinctive coastline shapes are just not visible. I also later discovered an off-by-one bug in the program that I used to compute the color assignment to tiles, which skewed the projection around seams.

So around March last year, I decided to build a bigger version of it, at a whopping 3x the scale of the earlier project. This will surely give me enough resolution!

The construction starts with building a 36x36x36 stud cube from LEGO Technic beams. The studs are facing all 6 directions.

The lumps at corners are for changing the stud direction as well as for supporting the structure.

The idea is to basically build six peels and put them on the surface of this cube, completely covering the cube:

Each peel is big enough that it needs its own support structure. When attached to the cube, the center of each peel will not be connected to anything, so it needs to be fairly sturdy. It would be fun to write another program that does the structural analysis and figures out how much support would be adequate. Perhaps a TODO for the next version. The clipboard you see on the left is the instruction computed by my program.

The support structure is mostly made of 2×4 bricks. Just for this purpose, I bought several 1000s of them:

On this support structure, I then built up the arch over it. One section is almost complete!:

One thing I wasn’t too careful about in the planning phase was the seasonal change of the Earth surface.

The image I based the texture mapping on didn’t have any sea ice, so I used a different source to add in sea ice. Unfortunately, those two sources did not agree on the season. If you look carefully, the sea ice is the winter level, but the land ice is the summer level:

On the positive side, the higher resolution really paid off. Here are two Earths side by side, showing the same side.

We are used to seeing the Earth as a two dimensional map projection, so our mental image of the Earth is quite skewed. Building an actual sphere was quite an enlightening experience, as I get to really look at the actual proportions of things. For example, I used to think that when I fly from the US to Europe, I’m going about one third of the Earth, but it turns out it’s more like just one fourth. Africa is really, really large, and the mouth of the Amazon river is so vast that the brown area caused by the river-carried dirt stretches a hundred kilometers. For that I gave it a distinctive brown tile:

This version of Earth is about one meter across. It’s quite heavy, but I can still carry it by myself.

Once I assembled it, I still had some more fun by calculating various numbers based on this scale.

For example, the highest point of the Earth, Mount Everest in the Himalayas, is only about 0.3mm tall above the sea level. That is, it’s not even one tenth of a LEGO plate high. In fact, the whole Earth’s crust is only about 30km deep, so that’s only half a LEGO plate high. If I wanted to correctly model the interior of the LEGO Earth, then I would basically have to build a red sphere all the way, except the top-most plates. It’s almost like we are living on an eggshell, and no wonder the continents move over time!

At the same scale, the Sun would be about 50m (200ft) in diameter at 6km / 4miles away. That’s big enough to comfortably fit the Statue of Liberty inside. Someone please build that in LEGO and we will put that and my Earth together at the accurate distance!

Finally, if you put this Earth in my home in San Jose, the furthest man-made object, Voyager 1, is past San Diego and into Mexico, flying at the speed of 1.5m/h. Picture a garden snail that crawls past the LEGO Earth, slow that down 100 times, and that’s about the speed of Voyager 1. No wonder it took 35 years to get there. Go Voyager!

Anyway, that’s it. Now that I’m finished with this project, what am I going to build next?

lego, misc