Note: This file has moved to notablog.

Tracey Moore writes:

> My company has recently completed a visual basic/java application
> and we now need to document the code. (All we've done so far is print
> out the code itself.) Does anyone out there have experience with this?

Depends, of course, on what "document the code" means. Peter, Jim and Bruce gave excellent advice. Here's a bit more perspective, assuming you're talking about documenting the source to help a programmer later modify, maintain, or analyze the code.

The program consists of one or more object classes, each of which has one or more methods that interact with each other and other objects. You should document at four levels:

The first level, the overview, should describe each object, what it does, which of the other objects it interacts with, and how.

The second level should describe each object in terms of its methods and object variables, what purposes they serve, and which methods call other methods. This section should look at the code as a whole, not at the individual components. It should document the flow of execution and the relationships (method calls) among the objects, not simply list each object and the methods and variables associated with it.

For example:

First level description of the XYZZY thesaurus server:

xyzzy - the main object, starts the program, instantiates configuration (which instantiates thesaurus), instantiates a pool of portlisteners, which listen at the configured port for a request and then look up synonyms in the thesaurus object.

configuration - loads the configuration file, parses it, sets all of the configuration variables, and instantiates the thesaurus.

thesaurus - the thesaurus data, a static object globally available to the rest of the program, instantiated by the configuration object.

portlistener - instantiated in a pool by xyzzy, takes turns listening at the port to serve the next request.
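As a rough sketch, the first level description above could sit in a Javadoc comment at the top of the main class. The class names are capitalized here to follow Java convention, and every method body, the pool size, and the sample thesaurus entry are hypothetical stand-ins so that the sketch compiles; none of this is the real XYZZY code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * XYZZY thesaurus server: first level overview.
 *
 * Xyzzy         - the main object; starts the program, instantiates
 *                 Configuration, then instantiates a pool of PortListeners.
 * Configuration - sets the configuration variables and instantiates the
 *                 Thesaurus (a real version would load and parse a file).
 * Thesaurus     - the thesaurus data; a static object reachable from the
 *                 rest of the program, instantiated by Configuration.
 * PortListener  - instantiated in a pool by Xyzzy; answers a request by
 *                 looking up synonyms in the Thesaurus.
 */
class Xyzzy {
    public static void main(String[] args) {
        Configuration config = new Configuration();
        List<PortListener> pool = new ArrayList<>();
        for (int i = 0; i < config.poolSize; i++) {
            pool.add(new PortListener());
        }
        // No networking in this sketch: ask the first listener directly.
        System.out.println(pool.get(0).answer("big"));
    }
}

/** Stand-in configuration: hard-coded values instead of a parsed file. */
class Configuration {
    /** Set by the constructor; globally available afterwards. */
    static Thesaurus thesaurus;

    /** Hypothetical pool size, normally read from the configuration file. */
    final int poolSize = 2;

    Configuration() {
        thesaurus = new Thesaurus();
    }
}

/** Stand-in thesaurus holding one hard-coded sample entry. */
class Thesaurus {
    private final Map<String, List<String>> synonyms =
            Map.of("big", List.of("large", "huge"));

    /** Returns the synonyms for a word, or an empty list if it is unknown. */
    List<String> lookup(String word) {
        return synonyms.getOrDefault(word, List.of());
    }
}

/** Stand-in listener: answers a request without opening a port. */
class PortListener {
    List<String> answer(String word) {
        return Configuration.thesaurus.lookup(word);
    }
}
```

A reader opening the main class sees the whole cast of objects before reading any method, which is the job the first level is meant to do.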
Second level description:

xyzzy.java is the main object; execution starts when you load the class and the main() method is invoked.

xyzzy.main() calls the xyzzy.load_config() method, which instantiates configuration.java (invoking configuration.init()), and saves the returned values as object variables.

configuration.init() loads the config file and sets all of the configuration options, including file locations, how many port listener objects to maintain in the pool, the port to listen on, permissions, etc. See the configuration file or the configuration.java source for details on which options are configurable. Then init() invokes configuration.load_thesaurus().

configuration.load_thesaurus() instantiates the thesaurus object as a static class variable, passing in the thesaurus data file named in the configuration file.

configuration.init() returns references to the configuration options and the thesaurus object to xyzzy.main().

xyzzy.main() instantiates a pool of portlistener objects and invokes portlistener.listen(), with the port number as an argument, on the first portlistener.

portlistener.listen() listens on a port until it gets a request, whereupon it relinquishes the port back to xyzzy.main() to assign a new portlistener, and invokes portlistener.do_request() to answer the request.

portlistener.do_request() invokes thesaurus.lookup(), passing the arguments of the request...

If you produce a printed form of this, it helps to use some sort of visual cue (indentation, boxes, colors, whatever) to delineate the methods that are internal to an object. In source code, indentation usually piles up too quickly and you end up indenting 40 spaces, which just gets too unwieldy.

Note: The configuration file mentioned above should include copious comments before each configuration option, explaining what it means, what the various options are, and recommending a course of action for the user.
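A commented configuration file of the kind described in the note might look like the text embedded in the sketch below, read here with Java's standard java.util.Properties class. The option names (port, pool_size, thesaurus_file), their values, and the built-in fallback values are all assumptions made for illustration; the original does not specify a file format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class ConfigSketch {
    // Hypothetical file contents; lines starting with '#' are comments.
    static final String SAMPLE_FILE = String.join("\n",
        "# port: TCP port the server listens on for requests.",
        "# Change it only if another service already uses this port.",
        "port=4000",
        "",
        "# pool_size: number of port listener objects kept in the pool.",
        "# Raise it if requests queue up. Leave it commented out to",
        "# accept the built-in value.",
        "#pool_size=4",
        "",
        "# thesaurus_file: location of the thesaurus data file.",
        "thesaurus_file=thesaurus.dat");

    /** Parses the text; any option the text omits falls back to a built-in value. */
    static Properties load(String text) {
        Properties fallback = new Properties();
        fallback.setProperty("port", "4000");
        fallback.setProperty("pool_size", "4");
        fallback.setProperty("thesaurus_file", "thesaurus.dat");

        Properties config = new Properties(fallback);
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return config;
    }

    public static void main(String[] args) {
        Properties config = load(SAMPLE_FILE);
        System.out.println("port = " + config.getProperty("port"));
        System.out.println("pool_size = " + config.getProperty("pool_size"));
    }
}
```

Each option carries its explanation and a recommendation directly above it, so a user editing the file does not need to consult the source.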
The "default" configuration should work; if there are things that can't be left to the default, the installation program should figure them out, or explain the question to the user, get the data, and set the configuration.

The third level should describe each method in general terms: the general structure of the method, which other methods and variables it uses, and even which methods it expects to be called from (this last is the most likely to change over time, but at least it will give the reader an idea of what the method was designed for). The arguments to the method should be described, specifically what they *mean*, where they're probably set elsewhere in the program, and what the method will do to them or use them for.

The fourth level is comments in the source code of the objects and methods themselves. Focus on why something is being done, and not just what is being done. "Decrement the counter" as a comment isn't terribly useful. Knowing where the counter is from, where it was set, what it's counting, and why it needs to be decremented would be better.

Part of the "documentation" at the fourth level is simply code style for legibility and comprehensibility. Use a good indentation style, and group things so they can be easily understood. Put the accessor methods at the beginning of the object. Put the mutator methods next. Then probably the handful of methods that do most of the work, and are called by the accessors and mutators. Then specialized methods (usually rather short and singular of purpose, and only called from the handful of methods that do most of the work).

Name methods and variables appropriately so it's easy, having read the rest of the code, to guess what a method is doing. Also name them so they make it explicit what they are.
For example, if a variable is a static configuration variable, then name it so it reflects that, and unless it is commonly used throughout the program, include a comment where you use it, so the reader can understand which method initializes the variable and where that method was invoked from.

There is even a school of thought, in programming, that says code shouldn't *need* comments; it should be written to clearly communicate the intent, and a need for a comment is an indicator that the code itself needs further clarification. I think this is going a bit far...

The documentation should be incorporated into the source code, so it will always be where the code is. The compiler will remove it when the code is compiled. The third level stuff should be at the beginning of each method. The second level stuff should be at the beginning of each object. The first level stuff should probably be at the beginning of the main object. All of the levels will overlap to a degree. (Java's Javadoc standard is designed to do this.)

Changes should be noted in comments at the point the code is changed, including a date/time stamp, and in a master list of changes at the beginning of the main object (with pointers to the location