Protect your Site from Bots and Help Digitize Books

June 1, 2009

Having investigated several CAPTCHA solutions to prevent spam bots or “not so serious” visitors from registering for a site’s account, I can highly recommend reCAPTCHA. It is a very secure implementation that also makes good use of the time people spend entering answers to challenges by using part of the answers to digitize books.

12 Steps to Better Code – Revised, now 7

May 11, 2009

Note: This is a work in progress, but I wanted to get it out “as is” to see what others think.

Unit testing is now such a paramount part of producing quality software that its absence from the list alone necessitates a revision of Joel’s 12 Steps to Better Code. There are other enhancements to include based on what we have learned in the almost 9 years since their original publication. Also, 7 Steps sounds sexier.

So here’s my revised list. Use these steps to evaluate a software organization the same quick-and-dirty way Joel describes in his original article. Remember, there is no silver bullet.

  1. Do you have (something at least resembling) a spec that solves a real user/business problem?
  2. Do your developers have frequent code design sessions and mutual code reviews?
  3. Do you use unit and regression tests that are very easy to write and execute?
  4. Do you have automated continuous builds that include running all unit and integration tests?
  5. Do programmers have quiet working conditions and regular, 40-hour-week work schedules?
  6. Do you have testers with input in the development life cycle and the requirements?
  7. Do you have an integrated system for bug tracking, development tasks, source control browsing/statistics and a wiki?

Compared to the original list, I enhanced and combined some of the original items, added a few and re-ordered the whole list. I also dropped a lot (see below). In addition to applying my own experience as a software development team member, I integrated a few ideas repeated in other comments on the original list. I purposefully avoided buzz words such as agile or pair programming, because they are mostly polarizing and would distract from the real goal of this (and the original) test: evaluate the quality of your software development process and have a simple guideline for improving it.

Before I go into detail about the items on the list, I want to comment on what I dropped:

Original 1. Do you use source control? From a perspective of necessity to the software development process, using source control, even if you are the only one on the project, is equivalent to providing electricity to your computer. I am not going to put the latter on the list, either.

Original 2. Can you make a build in one step? I removed this for two reasons: First, it is inherent in the new step 4. Second, tools for doing very sophisticated things with builds are readily available nowadays (Ant, NAnt, SCons, Cruise Control …).

Original 5. Do you fix bugs before writing new code? This did not really get dropped, but it may be hard to recognize as inherent to step 3 of the new list. More details on this below.

Original 6. Do you have an up-to-date schedule? This is not always applicable based on context and it’s also a very controversial item with no simple solution. Most software teams have experienced that meeting a schedule is a rare event, even if you keep moving the delivery date. Some teams try to solve the problem with frequent, small release cycles; other teams try to find different solutions. In essence, how this problem gets solved has often more to do with actual business requirements than with the quality of your software development process. Adhering to the 7 steps in the new list while at the same time meeting business requirements will essentially force this problem into a particular, workable solution. Also, since step 4 is essentially assuring that there is (almost) always releasable software, it may provide management with an opportunity to make more flexible release decisions.

Original 9. Do you use the best tools money can buy? Depending on your context, the best tools are actually free nowadays. You should still have the best tools, but what they are is so specific to a particular team’s requirements that it is quite difficult to evaluate a team based on the ones used.

Original 11. Do new candidates write code during their interview? As much as I would like that to happen, team environments differ so significantly that this just is not a reliable indicator for a functioning software development process. What about open source projects that don’t hire people? There are many ways to evaluate a new team member and integrate him/her in the existing team process that are better than the traditional interview. One is to have a conditional agreement and get the new guy/gal to write code.

Original 12. Do you do hallway usability testing? What if there is no hallway? Teams differ too widely in their structure to make this an indicator for a good process. The requirement that the end-result of a functioning process must be usable is embedded in the new steps. But usable does not always mean usable by an end-user. A software component, for example, must be easily usable by other software developers; “hallway” testing in the way Joel described it is not going to help there. Steps 2 and 7 on the new list may.

The new 7 Steps to Better Code

1. Do you have (something at least resembling) a spec that solves a real user/business problem?

This is the most important step, period. If you find a team that writes software without this, run. Not having to solve a real, existing, validated user or business problem will guarantee that requirements change in such extreme ways that they cannot be reasonably accommodated or even anticipated with any of the other 6 steps. Schedules will never be met (because early releases will demonstrate incorrect or insufficient requirements) and everyone will be frustrated from very early on. It’s a vicious cycle. It’s like taking a road trip without an idea where you are going, just because it sounds like fun. But everyone in the car will be screaming and cussing.

I am repeating this in different words because it’s so important: Make sure that what your team is building is needed. The spec could be written on a napkin, but it must have been validated as a real need. Otherwise, don’t bother.

2. Do your developers have frequent code design sessions and mutual code reviews?

This step migrated from the bottom of my first draft to almost the top because besides having something real to build, I have come to experience this step as absolutely fundamental to building quality software. Frequent design and code reviews may be new to some and not be going far enough for others. A third group of people might experience some serious discomfort on reading this, because it affects the developer’s ego. Even if you are a team of one, have someone else look over your ideas of how you want to build software. Doing so causes several things to happen:

  1. You will seriously think through the problem first before writing any code, because you will have to talk to someone else about why you think your approach is correct.
  2. While working on 1., you will uncover any deficiencies in the requirements given to you before you have wasted time writing code that solves the wrong problem, and even before you have wasted another programmer’s time reviewing your design.
  3. Both code design sessions and code reviews will dramatically enhance code quality. I consider myself a decent software engineer, but none of the stuff I come up with on my own is half as good as what I produce if I cooperate with another developer on it, even if the other developer is less skilled.
  4. Less skilled developers will quickly improve their coding and design skills as a result of the frequent interaction with more advanced programmers.

I have experience with pair programming and love doing it. But, in today’s typical software business environments, it simply is not reasonable to expect two people to sit next to each other 6 to 8 hours a day while only one of them is writing code. An average of one hour per developer per day doing code design sessions and mutual code reviews with feedback will go almost as far as pair programming in code quality improvements and team education/IP transfer, with much lower personnel costs.

3. Do you use unit and regression tests that are very easy to write and execute?

Even though I frequently interact with developers who see unit testing as an afterthought, my own experience with Test-driven Development has proven to me that this is by far the best way I, as an individual, can produce quality software. In order for the individual programmer to even be able to use TDD, the software process/environment must make it extremely easy to write and execute tests. This has further-reaching implications than most realize:

First, the tools for unit testing must be available, and if they are not, they need to be devised first. This is true whether you are writing a web application, a scientific application, an embedded or mobile application or a device driver. Your ability to take the smallest unit of code and test it independently of the rest of the system will enable you to produce very high quality code.

Second, as a corollary, this approach forces you to design your software system so that you even can isolate every small unit of functionality. That is very important for future code quality: On the one hand it makes the components of your system as independent as possible, allowing you to easily replace each of them with new implementations down the road to accommodate new requirements. On the other hand, it enables you to enhance code coverage as well as write robust regression tests down the road, which is important, because it is unlikely that you are starting out with 100% statement or branch coverage.

Third, in combination with the “pass” requirement described in step 4, this step means that every new bug gets a regression test that must pass before new software can be written.

Fourth, the entire system needs to be flexible and should be able to be “brought up” automatically into whichever state is needed. See the next step for more about that.
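As a minimal sketch of the first three points, assume a hypothetical formatPrice function and a hypothetical bug report (neither comes from a real project). The function receives its one dependency as a parameter so it can be tested in isolation, and the reported bug gets a regression test of its own. The small assertEqual helper is also made up for this example, so that the sketch runs without any test framework.

```javascript
// Hypothetical unit under test: formats an amount in cents as a price string.
// The currency lookup is passed in, so the unit can be tested without
// the rest of the system (no database, no network).
function formatPrice(cents, getCurrencySymbol) {
	var negative = cents < 0;
	var abs = Math.abs(cents);
	var whole = Math.floor(abs / 100);
	var fraction = abs % 100;
	// Fix for the (hypothetical) reported bug: single-digit cents must be padded.
	var padded = fraction < 10 ? "0" + fraction : "" + fraction;
	return (negative ? "-" : "") + getCurrencySymbol() + whole + "." + padded;
}

// A stub replaces the real currency service during the test.
function stubSymbol() {
	return "$";
}

// Tiny assertion helper so the sketch runs without a test framework.
function assertEqual(actual, expected, name) {
	if (actual !== expected) {
		throw new Error(name + ": expected " + expected + " but got " + actual);
	}
}

// Unit test: the ordinary case.
assertEqual(formatPrice(1999, stubSymbol), "$19.99", "formats dollars and cents");

// Regression test: written when the bug was reported, kept from then on.
assertEqual(formatPrice(1205, stubSymbol), "$12.05", "pads single-digit cents");
```

Passing the dependency in as an argument is only one way to isolate a unit; the point is that the test needs nothing but the function itself.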

4. Do you have automated continuous builds that include running all unit and integration tests?

Build in this context means something that automatically composes your system in its entirety and executes a vigorous functional validation test suite against it. It doesn’t just mean compile, which is not applicable in many contexts anyway. There are some contexts in which even the automated acceptance tests have part of the system stubbed out, for example when writing embedded software. But if you can feasibly devise a system that is equivalent to the final deployment, even if it takes extra work, then you should make the effort. For example, if there is any way that you can get your embedded code to automatically deploy to that FPGA chip you are trying to program, you should attempt to make it happen and write a test suite against the actual chip. If you are writing legacy FORTRAN code, find a way to emulate your entire system and run the tests against that.

+ whatever Joel had to say on the one-step build.

There is no excuse for not doing continuous testing.

The principle behind this step is that new code must never break the build; and if it is broken, it’s the first thing that gets addressed before anything else. As a consequence, bugs are always fixed first, because a new regression test for an existing bug would break the build until the bug is fixed.
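The rule can be sketched as a gate in a build script. This is only an illustration, assuming tests are plain functions that throw on failure; runSuite and the two sample tests are hypothetical names, not part of any particular build tool.

```javascript
// Runs every test and reports the build as green only if all of them pass.
// A test is assumed to be a function that throws when it fails.
function runSuite(tests) {
	var failures = [];
	for (var name in tests) {
		if (!Object.prototype.hasOwnProperty.call(tests, name)) {
			continue;
		}
		try {
			tests[name]();
		} catch (e) {
			failures.push(name + ": " + e.message);
		}
	}
	return { green: failures.length === 0, failures: failures };
}

// Hypothetical suite: one passing unit test and one regression test for a
// bug that has not been fixed yet.
var result = runSuite({
	"adds two numbers": function () {
		if (1 + 2 !== 3) { throw new Error("expected 3"); }
	},
	"regression: reported bug": function () {
		throw new Error("still broken");
	}
});

// The open bug keeps the build red, so fixing it comes before new features.
console.log(result.green ? "BUILD PASSED" : "BUILD FAILED", result.failures);
```

A real continuous build would run this after composing the whole system and would signal failure to whatever triggers the build; the sketch only shows the pass/fail decision.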

The immediate business benefit is two-fold: On one hand, this approach provides a level of business process stability and security, because business owners can assess the state of the product simply by analyzing the result of the latest continuous build (which they should be able to do easily). On the other hand, it provides scheduling flexibility, because business owners can make release decisions for any of the passed builds.

Because many software projects have multiple concurrent release cycles in various release states and therefore keep multiple branches of code around, this step forces every single component of the system to be designed with the utmost flexibility. Database systems are an often overlooked component; as a consequence of steps 3 and 4, their creation and management need to be automated. The initial effort of getting this done right will result in significant savings down the road.

5. Do programmers have quiet working conditions and regular, 40-hour-week work schedules?

The same arguments that Joel made 9 years ago all still apply, and it’s sad to see that we haven’t made much progress … despite so much evidence collected by others that supports this point so thoroughly that the best I can do is repeat it:

  1. Don’t kill the zone, because it is expensive to get back into it.
  2. Remember the law of diminishing returns. More work does not necessarily mean more (or better) output. It usually means the opposite.

6. Do you have testers with input in the development life cycle and the requirements?

What good is testing if it is not used to find problems? What if the problem is not a bug in the software, but something else, for example, in the process or with the requirement? What if a usability tester can spot a usability bug in the requirements before any code is ever written? Human testing must be used to uncover all the problems that cannot be found with automated tests. Depending on a team’s composition, testers may not even write any automated tests, but simply observe or review them for correctness.

The biggest issue is if there is no independent quality control, someone to keep requirement writers and software developers on their toes to build quality software; the kind of quality control that rejects code changes if there is not sufficient unit test coverage, the kind that protects developers against silly requirements, the kind that in very few teams actually exists.

In small teams, anyone may be a tester and it may be beneficial to the quality of the final product to give everyone the ability to veto requirements, design, code or even test results.

7. Do you have an integrated system for bug tracking, development tasks, source control browsing/statistics and a wiki?

One would think this should be treated like source control, and hopefully 9 years from now we can do that. But the current state of affairs in many teams is what I call the “SharePoint” problem: multiple bug tracking systems, Word documents, spreadsheets, tables, wiki entries, lists, text notes, print-outs, JPEGs, Photoshop and Illustrator files and who knows what else are littered over many servers and drives, with many duplicates and no way of finding anything. Some people try to solve this problem by installing a full text search system against the entire mess only to find out that any reasonable search returns either hundreds of unorganized documents or nothing. Or there is no reasonable search term to define.

All communication, requirements, comps, tasks, bugs, and documentation need to be in a single place. E-mail should never contain specifications. At most, it should contain a link to a wiki page that has the specifications on it so they can be discussed there. If the source code is the specification, create a new tracking item for the discussion pointing to the code and e-mail the link to the tracking item. That requires whatever system you are using to support the easy creation of links to pretty much any item in it. Better yet, don’t use e-mail for this stuff at all, but have a good notification system.

For every modification to be done to the system, there ought to be a tracking item that allows visibility into the change for all stakeholders: the business owner, the developer and the tester. This is true for any software project.

Without such a system, it’s hard to do any of the other 6 reasonably efficiently.


Debugging Adobe AIR Applications with Aptana Studio

February 11, 2009

Aptana’s AIR plugin is providing (almost) full debugging support for Adobe AIR.

I consider any platform production-ready when you have an IDE with full debugging support available for it. The simple reason is that being able to debug code using breakpoints and step-by-step execution makes me at least 10 times faster (rough estimate) in finding bugs or understanding someone else’s code. This is a significant difference, because it means I can find and fix a bug, including writing a test for it, in half an hour with a debugger versus a whole day without one.

Adobe AIR is an appealing environment for cross-platform application development, especially for a web-focused developer like me. Aptana recently released an Adobe AIR plugin for their Studio product that allows full debugging of JavaScript right from within the IDE. From a productivity-focused developer perspective, this makes AIR now an option.

There are still a few bugs in the plugin (it’s “beta”). I am sure they will be fixed soon; one bug I filed with Aptana’s issue tracker got attention within minutes.

UPDATE: The one bug I had encountered was already fixed in their latest beta. Yeah!


HP ScanJet on Mac OS X (or any scanner, for that matter)

December 30, 2008

You can use a legacy USB scanner on Mac OS X with this set of TWAIN/SANE tools, which is based on the SANE implementation for *NIXes.

Since it took me a while to find what I needed to get my old HP ScanJet 5470c to work on Mac OS X (10.5), I wanted to post this link accompanied by some of the keywords for scanners that others may use (so they’ll have an easier time finding it on search engines):

http://www.ellert.se/twain-sane/

This site lists several binary packages which should make your scanner work for you, too … after you have installed them in the following order:

  1. gettext
  2. libusb
  3. SANE backends (installs all, even though you may only need the HP 54XX one)
  4. SANE Preference Pane
  5. TWAIN SANE Interface

After installing all these, go to the SANE Preference Pane and disable all the backends you don’t need (at least that’s what I did; it works for me, but I don’t know if it’s necessary). Next I clicked on Configure and then OK; without this step, apparently, the SANE/TWAIN interface is not activated. After closing the preference pane and entering my password to store the settings, I opened Image Capture. A preview dialog popped up and my already connected scanner did a quick preview scan. From there I was able to scan the document on the flatbed.

If this does not happen for you, make sure your scanner is properly connected. By going to the Apple Menu > About This Mac > More Info, you get access to the System Profiler. One of the last entries in the left column under hardware should be USB. After selecting this entry, the right panel should show the USB Device Tree. Make sure your scanner shows up in that list.

If that still does not help, try restarting your Mac, although that should not be necessary.

I presume this works for pretty much any scanner listed in SANE’s Supported Devices List, but it may be of particular interest for legacy devices such as:

Hewlett-Packard ScanJet 2100c, 2200c, 2300c, 3300c, 3400c, 3500c, 3530c, 3570c, 3800c, 3970c, 4070 Photosmart, 4100c, 4200c, 4200cse, 4200cxi, 4300c, 4300c Silitek, 4370c, 5200c, 5300c, 5370c, 5400c, 5470c, 5490c, 5550c, 5590c, 6200c, 6250c, 6300c, 6350c, 6390c, 7400c, 7450c, 7490c, 7650c, 8200c, 8250c, 8290c, G3010

Update: Some people have commented with problems on Snow Leopard. One commenter suggested that VueScan works, but it’s not free. You can probably find a used scanner that does work on craigslist for less than what VueScan costs.


From No Clue to 3 Websites with Django in 1 week

July 18, 2008

I started “40 hours ago” without much knowledge of Python or Django (I had played with both for a couple hours a while back). Now I have three mini-sites using the Django framework live.

Thus, my first few experiences with Django on Python have been remarkable.


Model Katja Jung

February 11, 2008

This is a little off-topic: My sister just announced that she started a blog! She works as a model on the side while going to med school to become a pediatrician. I guess the blog’s purpose is to stay in touch with her fans. The blog is in German, so if you can read German, check it out: http://www.katjajung.de/blog/.


javascript img src

February 11, 2008

People keep searching for “javascript img src” … maybe this is what they’re looking for:

How to create an image DOM element dynamically and place it on the page

There are many ways of doing this; here’s one in a code snippet:

<html>
<body onload="LoadImage();">

<div id="ImageContainer">
An image will go here:<br />
</div>

<script type="text/javascript">
//<![CDATA[

function LoadImage() {
	// get a reference to the DIV into which we want to put the image [1]
	var container = document.getElementById("ImageContainer");

	// create a new image element [2]
	var anImage = document.createElement("IMG");

	// when the image is loaded, attach this image element to the container DIV
	// (but not earlier) [3]
	anImage.onload = function() {
	 	container.appendChild(this);
	}

	// set the image to load [4]
	anImage.src = "http://www.google.com/intl/en_ALL/images/logo.gif";
}

//]]>
</script>

</body>
</html>

What’s going on:

  • When the page is done loading, a JavaScript function to load the image is called.
  • [1] This function gets an object reference to the DOM node of the DIV container that will hold the image once it’s loaded.
  • [2] It then creates a new Image element DOM node that is not yet part of the page’s node tree (and therefore does not get displayed).
  • [3] The onload handler for that node is created as a function that simply attaches the loaded image to the container. Note that this is essentially asynchronous programming: This function does not get executed until the image has been completely loaded (which may be never!). This function also uses what is called a closure over the variable context as it was defined when the function was created, meaning that all the variables outside of its function body are available inside, for example container.
  • [4] (Almost) as soon as an image URL is assigned to the image, the browser begins loading it.

The function defined in [3] will always execute after [4].
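Since the load may never finish, it can help to handle failure as well. The following is a sketch rather than part of the original snippet: loadImageInto and its parameters are names made up for this example, and the document object is passed in as an argument, which is an assumption made only so the function can be exercised outside a browser.

```javascript
// Loads an image and appends it to a container once it has loaded.
// doc is a document-like object (in a browser, pass window.document).
// onFailure is an optional callback for when the image cannot be loaded.
function loadImageInto(doc, containerId, url, onFailure) {
	var container = doc.getElementById(containerId);
	var anImage = doc.createElement("img");

	// attach the image only after it has loaded completely
	anImage.onload = function () {
		container.appendChild(anImage);
	};

	// the load may fail (bad URL, network error); report it instead of
	// waiting forever
	anImage.onerror = function () {
		if (onFailure) {
			onFailure(url);
		}
	};

	// assign the handlers first, then start loading
	anImage.src = url;
	return anImage;
}
```

In the page above, the body of LoadImage could then be reduced to a single call that passes document, "ImageContainer", the image URL and a function that reports the failed URL.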


Detect IE6

February 11, 2008

Since people keep looking for this, here is how you can identify IE 6 (and IE 6 only):

<script type="text/javascript">
//<![CDATA[
var is_ie6 = (
	window.external &&
	typeof window.XMLHttpRequest == "undefined"
);
//]]>
</script>

This is reliable and even Opera is not pretending to be IE6 with this check. This is mostly useful to determine whether the PNG/Alpha Image Loader hack needs to be applied.
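For illustration, the result of the check might then be used roughly like this. This is a sketch under assumptions: applyPngFix, fixPngImages and the blank GIF URL are hypothetical names, the document object is passed in only so the code can be tried outside a browser, and the images are expected to have explicit dimensions. The AlphaImageLoader filter is the IE-specific filter the hack relies on.

```javascript
// Replaces a PNG image with a transparent placeholder and lets IE's
// AlphaImageLoader filter render the original PNG with its alpha channel.
// img is an IMG element; blankGifUrl points to a 1x1 transparent GIF.
function applyPngFix(img, blankGifUrl) {
	var pngUrl = img.src;
	img.style.filter =
		"progid:DXImageTransform.Microsoft.AlphaImageLoader(src='" +
		pngUrl + "', sizingMethod='scale')";
	img.src = blankGifUrl;
}

// Applies the fix to every PNG image in the document, but only when the
// IE6 check was positive. Returns the number of images that were changed.
function fixPngImages(doc, isIE6, blankGifUrl) {
	if (!isIE6) {
		return 0;
	}
	var images = doc.getElementsByTagName("img");
	var fixed = 0;
	for (var i = 0; i < images.length; i++) {
		// simple suffix check; URLs with query strings are not handled here
		if (/\.png$/i.test(images[i].src)) {
			applyPngFix(images[i], blankGifUrl);
			fixed++;
		}
	}
	return fixed;
}
```

On a real page, the is_ie6 value from the check above would be passed as the second argument once the page has loaded.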

UPDATE:

The code provided by Lea Verou below performs up to 50% faster on all browsers except Firefox 2 and Google Chrome (where this test is really slow for some reason; maybe the in operator has not been optimized?)

<script type="text/javascript">
//<![CDATA[
var is_ie6 = ('ActiveXObject' in window && !('XMLHttpRequest' in window));
//]]>
</script>

I cannot wait for the day when we do not have to deal with IE6 anymore. Considering how many intranet applications exist that specifically target this browser, this day may be in a distant future, unfortunately.


Real Productivity & How Not Having ReSharper Really Hurts

February 8, 2008

Synopsis: ReSharper is a productivity-improving tool for Visual Studio. NOT having it really hurts after getting used to it.

As part of our team evaluation, I have been running ReSharper for the last month. Even though it improved my daily coding routine, I was not able to quantify whether it would be worth the money to buy a license to run it on Visual Studio for me and/or others in the team.

Then we decided to move to Visual Studio 2008 a few days ago. The migration worked fine for everyone except me. I was getting a really weird exception from ReSharper when trying to edit comments. There was also a problem with the key mappings. The only option for me to continue work properly was to uninstall ReSharper.

Now I know how much it really hurts my productivity not to have this tool.

ReSharper isn’t perfect, by any means. Besides the bug described, it does not execute the NUnit 2.4-style GlobalSetup steps in unit tests, which means I cannot use their beautifully integrated unit testing tools. This has been a known shortcoming for over a year and the fact that JetBrains is not in a rush to fix this worries me. Sometimes I also get stuck in the parameter list popup, which hijacks my cursor keys and cannot be closed with ESC–but maybe that’s just a user error.

But there’s a lot to like: The code quality analyzer running on an open file is very helpful. I made it part of my coding quality process to only commit files that pass the analysis and have the green little square in the upper right corner. The refactoring support is decent (certainly better than what comes with Visual Studio).

I cannot wait for them to fix the comment bug so I can continue using it. There are not any real alternatives to Visual Studio when it comes to .Net and C# development. I would guesstimate that my productivity is down 20% without it (which does not include long-term effects from having lower-quality code because of my oversights that ReSharper’s analysis process would have caught).

The Eclipse Difference

I should note that if I were developing in Java, I would not have any of these problems. Eclipse does most of what the Visual Studio/ReSharper combo does, some of it better. And it’s completely free. Alas, we’re not using Java for this project and–quite frankly–I am currently more fond of C#/.Net for various reasons …


Sooner or later, I will host this blog myself …

February 5, 2008

… and am making this mental note here for my own purposes: WordPress Configuration