Archive

Posts Tagged ‘git’

The other side of forking and pull requests

January 4th, 2013

Charles Nutter of JRuby fame had this tweet yesterday:

And this touches on something I’ve been thinking about for a long time.

My experience mostly comes from the Jenkins project, which has an open commit policy. Everyone gets to be a committer just by asking, and that grants them access to all the plugins. This policy is so deep in our culture that we even have an IRC bot that project members use to add new committers.

This is not necessarily a good idea for every project, but it certainly served us well, especially back in the days when we had the entire source code in the Subversion repository. During that time, for people to make changes to the plugins, they really needed to be a committer, because maintaining a custom patch set on top of an external Subversion repository is a royal pain in the backside. The open commit policy means anyone interested in committing can join the project without a fuss, and once they become a contributor and start committing code, they feel like a part of the project, and that feeds fresh blood into the Jenkins developer community.

While I totally believe in the technological superiority of Git as a version control system over Subversion, in my view, the fork and pull-request driven world of Git hasn’t been a complete win for the Jenkins project.

Forking and pull requests do encourage contributions by eliminating certain barriers to start hacking on someone else’s project. Whereas it was quite painful to maintain patches externally to someone else’s source code in Subversion, Git makes this trivial with branching and occasional rebases or merges. This part is a good thing.

On the other hand, this ease of maintaining your own versions reduces the incentive to bring your changes back to the community, to some extent. After all, you’ve already scratched your itch and it works on your machine. Why bother trying to push the code back and go through the trouble of writing tests and documenting the change, convincing others about the design, and arguing why that change was needed to begin with? You can see this in the number of forks people created that never turn into pull requests, and I’ve done this a number of times myself.

An even bigger problem in my mind is that by allowing people to contribute by just sending pull requests, it encourages the “throw the stuff over the wall” way of contributing. Sometimes people do the work without telling us anything up front, and you as the receiver are hard-pressed to turn it down, given that someone put so much effort into it. Sometimes when we request additional changes, they never come back. It pushes collaboration and communication to after the coding, not before it. And it also makes the pathway to the committer bit unclear. How do we convert people from the side sending pull requests to the side receiving pull requests? If we have to ask them one by one, it won’t scale.

The situation gets worse on plugins, where there may not be an active maintainer at the moment. We’d like those who are sending us pull requests to take over the maintainership, and use their best judgement to evaluate the changes. But when GitHub makes them feel that the proper way to contribute is to send a pull request, and their pull requests do not get any attention, I’m not sure how they are supposed to realize that the plugin needs a new owner. Back in the days of Subversion, the repository alone was enough to be the center of gravity of the project. It attracted new developers (so long as we let them in, aka the open commit policy) and it was almost self-sustaining. Now it takes more than just the repository.

In any case, I just wanted to point out that the implications of forking and pull requests are a bit more nuanced.

P.S. I’ve long been meaning to write a bot that patrols for such unattended pull requests and encourages their authors to become committers, but haven’t had a chance to do so yet. I still intend to do this, and we’ll see how it goes.


Jenkins Git Server Plugin

August 31st, 2012

The Jenkins Git Server plugin is a so-called “library plugin”, which doesn’t offer any user-visible feature by itself, but instead enables other plugins to do something easily inside Jenkins. In the case of the Git server plugin, it allows other plugins to easily embed Git server functionality (via JGit): create and manipulate Git repositories in the Jenkins server, expose them via SSH and HTTP transports for push/pull, and maintain local checkouts of those repositories.

As an example of how to use this plugin, I wrote the Git userContent plugin. This plugin exposes the $JENKINS_HOME/userContent directory as a Git repository, and enables administrators to use git to push/pull changes and manage them with history.

In terms of code, there are two classes that plugins like git-userContent-plugin should be interested in.

One is HttpGitRepository, which represents Git repository access via HTTP. Typically you have some directory inside $JENKINS_HOME that houses the repository, then you subtype HttpGitRepository and override abstract methods to fill in the missing details. FileBackedHttpGitRepository is a convenient default implementation that simplifies this further. GitUserContentRepository in git-userContent-plugin is an example of using this class. That class also implements RootAction to bind this repository at http://server/jenkins/userContent.git, and I expect this combination to be fairly common.

The other class of interest is RepositoryResolver. The Git server plugin adds the necessary Jenkins SSH CLI hook for exposing Git repositories over SSH. The only missing link here is that when the client runs “git clone ssh://server/foo/bar/zot.git“, we need to figure out which repository on the server corresponds to /foo/bar/zot.git, and that’s what the RepositoryResolver extension point does. The sample implementation in git-userContent-plugin will hopefully be self-explanatory. In this case, GitUserContentRepository is a singleton (because it’s a RootAction), so we inject that and basically just delegate the calls to it.

I’m looking forward to seeing more plugins take advantage of this feature to expose data as Git repositories. I think there are a lot of interesting uses for it.


Push changes directly into BuildHive, and never run tests again!

June 5th, 2012

On top of automatically building pull requests, BuildHive now allows you to push changes directly in via ssh. I call this feature “validated merge”.

If you are an active developer of a repository, chances are that you don’t use pull requests to send in changes. You probably just push changes directly into your repository instead. But one of the common problems of doing this is that if your change breaks the build, you are going to discover it too late. With validated merge, this problem will never happen again, and this is how it works.

First, you add the buildhive user as a collaborator to your repository, so that it can push when the build is done. I didn’t feel comfortable doing this automatically, so you need to do it explicitly.

Second, log in to BuildHive (once is enough, so that it learns the public keys you registered with GitHub), head to the job page in BuildHive (like this), and click the “Git repository for validated merge” link on the left. Click the copy button to copy the “git remote add” command, paste that in your shell, and execute it in your local repository. This will add BuildHive as a remote repository.

Preparation is now complete. Go modify code like you normally do by creating commits, and when you are ready to push, push your changes to “jenkins” instead of pushing to “origin”. If you are on a branch, replace “master” with the branch name you are working on:

$ git push jenkins master

BuildHive will check out the current tip of the specified branch in GitHub, merge that with the changes you just submitted, then do the build and run tests. If the merge result successfully completes the build and tests, the result will be pushed to GitHub by BuildHive.
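As a rough illustration, the sequence described above can be approximated with plain git commands. This is only a sketch of the idea, not how BuildHive is actually implemented. The function name validated_merge and its arguments are made up for this example, and it assumes a working clone whose “origin” remote is the upstream repository and which already contains the submitted commit.

```shell
#!/bin/sh
# Hypothetical sketch of a "validated merge"; not BuildHive's real code.
# Usage: validated_merge <work_dir> <branch> <candidate_commit> <build_command...>
# Assumes <work_dir> is a clone whose "origin" is the upstream repository
# and that <candidate_commit> is already present in that clone.
validated_merge() {
    work_dir="$1"; branch="$2"; candidate="$3"
    shift 3
    (
        cd "$work_dir" || exit 1
        # Check out the current tip of the upstream branch.
        git fetch -q origin "$branch" || exit 1
        git checkout -q --detach FETCH_HEAD || exit 1
        # Merge the submitted changes into that tip.
        if ! git merge -q --no-edit "$candidate"; then
            git merge --abort 2>/dev/null
            exit 1
        fi
        # Build and test the merge result; stop here if it fails.
        "$@" || exit 1
        # Only a result that passed gets pushed upstream.
        git push -q origin "HEAD:refs/heads/$branch"
    )
}
```

For example, `validated_merge /path/to/clone master <commit> make test` would push to master only if `make test` succeeds on the merged result. The build command here is a placeholder for whatever your project uses.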

If you are unlucky and your changes weren’t as good as you thought, you’ll learn that they didn’t pass the build. You then stash your current work, go back to the broken changes, and rework them. You probably don’t remember which commit you pushed to BuildHive, so BuildHive provides a tag you can fetch into your workspace to get to it.

$ git stash
$ git fetch -n ssh://anonymous@buildhive.cloudbees.com/kohsuke2/sandbox-ant tag changes/45
$ git checkout changes/45
$ … make edits ...

You can add more commits to correct the problem, or you can even amend your commits. Since your changes didn’t land in the repository, amending, rebasing, and so on are just fine. You can then re-push your changes and go back to what you were working on before you were interrupted by the failure.

$ git push jenkins HEAD:master
$ git checkout master
$ git stash pop
… resume the work you were doing ...

If you’ve already headed home, your colleague can do this workflow for you.

Your time is precious. Don’t waste it by watching your laptop run tests. Instead, let BuildHive do it for you!

By the way, this is a feature already available in Jenkins Enterprise by CloudBees, so if you like the workflow but your source code isn’t in a public GitHub repository, you can do the same thing with any Git repository inside your firewall.


Polling must die: triggering Jenkins builds from a git hook

December 1st, 2011

As I keep saying, polling a repository from Jenkins is inefficient; it adds a delay on the order of minutes before a build starts after a commit is pushed, and it adds additional load. It is much better to do push notification from the repository instead. In this post, I’m going to explain how to do this for Git, which brings it on par with Subversion.

The previous best practice for doing this is best summarized in this blog post. While this works, it is less than ideal. One part of the problem is that it requires hard-coding job names inside the repository hooks, making it hard to keep them up to date. Another problem is that if your job only cares about one branch in a busy repository, you don’t want a new build to be triggered by pushes to other branches. Finally, you need some extra work for a secured Jenkins.

With the latest Git plugin 1.1.14 (which I just released), you can now do this more easily by simply executing the following command:

curl "http://yourserver/jenkins/git/notifyCommit?url=<URL of the Git repository>"

This will scan all the jobs that are configured to check out the specified URL, and if they are also configured with polling, it’ll immediately trigger the polling (and if that finds a change worth a build, a build will be triggered in turn). This allows a script to remain the same when jobs come and go in Jenkins. Or if you have multiple repositories under a single repository host application (such as Gitosis), you can share a single post-receive hook script with all the repositories. Finally, this URL doesn’t require authentication even for a secured Jenkins, because the server doesn’t directly use anything that the client is sending. It runs polling to verify that there is a change before it actually starts a build.

One final note: in Git, unlike Subversion, a repository does not have its own identity, and a single repository sometimes has multiple URLs to access it. In such a case, simply execute the curl command multiple times, once for each of the different URLs.
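To make this concrete, here is a minimal sketch of what such a shared post-receive hook could look like. The function names, the JENKINS_URL variable, and the DRY_RUN switch are all assumptions made up for this example, and the repository URLs you pass in are your own. It uses curl’s -G and --data-urlencode options so that the repository URL is safely encoded into the query string.

```shell
#!/bin/sh
# Hypothetical post-receive hook sketch. JENKINS_URL, DRY_RUN and the
# function names are illustrative assumptions, not part of the Git plugin.
JENKINS_URL="${JENKINS_URL:-http://yourserver/jenkins}"

# Send (or, with DRY_RUN=1, just print) one notifyCommit request.
notify_commit() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "curl -fsS -G --data-urlencode url=$1 $JENKINS_URL/git/notifyCommit"
    else
        curl -fsS -G --data-urlencode "url=$1" "$JENKINS_URL/git/notifyCommit"
    fi
}

# Notify Jenkins once for every URL under which the repository is reachable.
notify_all() {
    for repo_url in "$@"; do
        notify_commit "$repo_url" || echo "notify failed for $repo_url" >&2
    done
}
```

A hook would then end with a call such as `notify_all git://server/foo.git ssh://server/foo.git`, listing every URL that jobs might use to check out the repository. Running it with DRY_RUN=1 first shows the requests without contacting Jenkins.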

That’s it. I hope this will help reduce the use of polling and have more people switch to push notifications. It really is addictive to see the build start in a split second after you push a change.
