The Real Basics of Programming in Java (for non-programmers)

by Steven J. Owens (unless otherwise attributed)

The Real Basics

These details get skipped a lot. If you've played a little bit with some programming, like BASIC, you may want to skip this, but unless you're confident, I suggest you at least skim it. Even if you are confident, I get into some programming in-jokes and stuff, further on, that might help ease the shock of getting into the programming world.

If you're confident, go back to The Elements of Programming In Java.

A program is a huge, complex list of step-by-step instructions that the computer carries out. A computer program is made up of words and punctuation.

A Short Example

Okay, so let's give you a short example of what a bit of program might look like:

System.out.println("Hello.") ;
int somenumber = 2 ;
int anothernumber = 3 ; 
int theothernumber ;
theothernumber = somenumber + anothernumber;
System.out.println("the other number is " + theothernumber) ;
theothernumber = theothernumber + 1 ;
System.out.println("the other number is now " + theothernumber) ;
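The lines above are only a fragment. To actually run them, Java wants them wrapped in a class and a main method. Here is a minimal sketch of that wrapping; the class name ShortExample is made up for this example, and the wrapper lines are explained elsewhere, so just treat them as packaging for now.

```java
// ShortExample is a made-up name, used only for this sketch.
class ShortExample {
    public static void main(String[] args) {
        System.out.println("Hello.");
        int somenumber = 2;
        int anothernumber = 3;
        int theothernumber;
        theothernumber = somenumber + anothernumber;
        // prints: the other number is 5
        System.out.println("the other number is " + theothernumber);
        theothernumber = theothernumber + 1;
        // prints: the other number is now 6
        System.out.println("the other number is now " + theothernumber);
    }
}
```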

This sort of human-readable stuff that makes up the program is usually called the source code. That's not an important detail, but if you're curious as to why, read Source Code: Compiled and Interpreted Programs, further down.

If you open up the source code for a program, you see a bunch of words, punctuation and whitespace (whitespace is spaces, tabs, or carriage returns/newlines).

The first step the computer takes in converting the human-readable words to computer instructions is breaking them down into tokens, chunking them up according to certain rules. For starters, whitespace separates the words, and a change from letters to punctuation usually separates them too.

Identifiers: Keywords, Variables, Literals

Strings of letters, like "somenumber", are identifiers.

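An identifier is either a keyword, which is a word the language itself defines, or a name that you make up, such as a variable name.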

In the example, int is a keyword, while somenumber, anothernumber, and theothernumber are variable names.

Most identifiers in any given large program are usually variable names, since there are only a limited number of keywords. On the other hand, those keywords get repeated a lot in any program.

Keywords are sometimes called "reserved words" (meaning that you can't use them for a variable name, because they're reserved to mean something special).

There are also rules for what you can put in a variable name, but for now just use normal words, nouns mostly, avoid using a keyword, and avoid doing anything fancy, and you'll be fine. Later on, read up on naming rules and conventions.

A literal means that instead of having a variable that contains a value, you just literally have the value typed right there in the code, like the number 343243, for example. To put a literal String (programmer speak for a piece of text) in a program, put quotes around it, "like this".
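To tie keywords, variable names, and literals together, here is a small sketch. The names LiteralDemo, count, and greeting are made up for the example.

```java
// LiteralDemo, count, and greeting are made-up names for this sketch.
class LiteralDemo {
    public static void main(String[] args) {
        // "int" is a keyword, "count" is a variable name,
        // and 343243 is a literal number.
        int count = 343243;

        // "greeting" is a variable name and "like this" is a literal String.
        String greeting = "like this";

        System.out.println(count);     // prints: 343243
        System.out.println(greeting);  // prints: like this
    }
}
```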

Punctuation

Most punctuation in the code is one of two things, either 1) operators or 2) start/end indicators for an expression, statement or block. See Fun With Punctuation for a little more detail.

Operators

An operator is basically a bit of punctuation that does something, like the plus character + adds two values together, or the minus character - subtracts them.

Most of the usual math symbols do what you'd expect them to, except for the equals sign =, which is an important exception I'll get to in just a second.

Another exception is that the plus sign + can be used to add two strings of characters together, so:

"foo" + "bar"

This works out to have the same value as "foobar". There's not much use for "foo" + "bar" in programming, but there's a lot of use for gluing strings together. The plus sign also automatically converts any non-string (like a number value) that is added to a string, so you can print it out.
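Here is a short sketch of gluing strings together, including a number that gets converted along the way. The names GlueDemo, glued, and total are made up for the example.

```java
// GlueDemo, glued, and total are made-up names for this sketch.
class GlueDemo {
    public static void main(String[] args) {
        // Two strings glued into one.
        String glued = "foo" + "bar";
        System.out.println(glued);  // prints: foobar

        // Here + is ordinary math, because both sides are numbers.
        int total = 2 + 3;

        // Here + glues, and the number is converted to text first.
        System.out.println("the total is " + total);  // prints: the total is 5
    }
}
```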

Another exception is ++ and --, which are called the increment and decrement operators. These get used so often in programs that I wanted to point them out. Java doesn't use them as much as some programming languages, but they still get used a lot. They're just a shortcut for "add one" or "subtract one", which seems kind of stupid at first. After you write a few programs, you'll realize they get used a lot to count down or count up from one number to another, or to step through a bunch of things.
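A quick sketch of ++ and -- at work, first on their own and then counting up inside a loop. The names are made up, and the for loop is something you'll meet properly later; for now just watch what happens to the numbers.

```java
// CountDemo and counter are made-up names for this sketch.
class CountDemo {
    public static void main(String[] args) {
        int counter = 5;
        counter++;  // same as counter = counter + 1, so counter is now 6
        counter--;  // same as counter = counter - 1, so counter is back to 5
        counter--;  // counter is now 4
        System.out.println("counter is " + counter);  // prints: counter is 4

        // Counting up from 1 to 3, adding one each time around.
        for (int i = 1; i <= 3; i++) {
            System.out.println("step " + i);
        }
    }
}
```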

You should read The Java Tutorial section on operators.

The Equals Sign =, Assignment and Comparison

The equals sign is used for assignment, meaning, storing a value into a variable. If you see:

a = 5

That means "store the value 5 in variable a." When programmers write this out, they usually say something like "let a equal five" or "a is assigned the value five".

If you try to put a literal on the left side, like this:

5 = b

...the compiler yells at you. This turns out to be a very good thing; we'll talk about it in a moment.

You can also assign one variable to another, which means that the value in the second variable gets copied to the first variable as well:

a = b
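The word "copied" matters here. In this small sketch (the class name AssignDemo is made up), changing b after the assignment leaves a alone, because a got its own copy of the value.

```java
// AssignDemo is a made-up name for this sketch.
class AssignDemo {
    public static void main(String[] args) {
        int a = 5;
        int b = 7;

        a = b;   // the value in b (7) is copied into a
        b = 10;  // changing b afterward does not touch a

        System.out.println("a is " + a);  // prints: a is 7
        System.out.println("b is " + b);  // prints: b is 10
    }
}
```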

Equals Signs, Assignment and Comparison, or Stupid Programmer Mistakes

Using a single equals sign for assignment is a very firmly established tradition in programming. This is sort of unfortunate, because it leads to one of the most common, and most frustrating typos in the programming world, which is "confusing assignment (=) with comparison (==)".

Most people first learn about the equals sign for comparison, like "is a equal to b?", and for finding out the value, like "two plus two equals 4." The motion sort of goes left-to-right, the same way you read English. In programming, when you see the equals sign, the motion goes the other way, from right to left - the value on the right gets stored in the variable on the left.

You use a double-equals sign == to do comparison. You write "is a equal to b?" like this:

a == b

It is very, very easy, even for an experienced programmer, to slip and write this instead:

a = b

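When this does happen, it can lead to very weird behavior, which is frustratingly hard to track down, and when you finally find it, you feel really stupid. Java's compiler catches most of these slips, because a test has to produce a true/false value, but it can't catch the slip when the variables themselves hold true/false values.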

One really good rule of thumb that I picked up somewhere is "literal on the left". If you're doing a comparison with a literal value, put the literal value on the left side. For example:

5 == a

This is good because it forces you to think in a slightly different way about it, and because the compiler will yell if you slip and write 5 = a.
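Here is a sketch of both the good habit and the slip. The names CompareDemo, same, and done are made up. Note that in Java the slip only gets past the compiler when the variable holds a true/false (boolean) value, which is the case shown here.

```java
// CompareDemo, same, and done are made-up names for this sketch.
class CompareDemo {
    public static void main(String[] args) {
        int a = 5;

        // Literal on the left: this is a comparison, and it results in true.
        boolean same = (5 == a);
        System.out.println("5 == a is " + same);  // prints: 5 == a is true

        // 5 = a;  // uncomment this and the compiler refuses to build it

        // The slip: this was meant to be (done == true), but a single =
        // stores true into done and then tests the stored value.
        boolean done = false;
        if (done = true) {
            System.out.println("oops, this always runs");
        }
    }
}
```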

Syntax

There are rules about how tokens can be used and how they can go together. These rules are called syntax. As you start to program, you'll be hearing about syntax a lot, mainly because a lot of the more common mistakes beginners make are syntax errors, usually finicky typos that are just damned hard to remember, until they become ingrained by habit. Don't get frustrated, it's not you; even experienced programmers often (usually) make stupid typos in their first draft of a piece of code. Fortunately, these days programming editors usually point the error out as soon as you type it.

Expressions

An expression is the smallest piece of code that can be treated like a value. So a variable is an expression. So is a literal. An operation, like 1 + 1, is an expression. So is somenumber + anothernumber. I guess you could say, as a rule of thumb, that anything that you could take out and replace with a literal value is an expression.

Another way to say that is, "anything that returns a value". "Returns a value" is programmer-speak for "results in a value being sent back". There's a return keyword that does this from a method (which is basically a grouping of code, also sometimes called procedures, subroutines, or functions). So a call to a method that returns a value is also an expression.
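To make that concrete, here is a sketch where the right-hand side of each assignment is a different kind of expression. The method addTwo and all the variable names are made up for the example.

```java
// ExpressionDemo, addTwo, and the variable names are made up for this sketch.
class ExpressionDemo {
    // A small method that returns a value: whatever it is given, plus two.
    static int addTwo(int number) {
        return number + 2;
    }

    public static void main(String[] args) {
        int somenumber = 2;
        int anothernumber = 3;

        int fromLiteral = 5;                             // a literal
        int fromVariable = somenumber;                   // a variable
        int fromOperation = somenumber + anothernumber;  // an operation
        int fromMethodCall = addTwo(anothernumber);      // a method call

        // prints: 5 2 5 5
        System.out.println(fromLiteral + " " + fromVariable + " "
                + fromOperation + " " + fromMethodCall);
    }
}
```

Any of those right-hand sides could be swapped for a plain literal without breaking the line, which is the rule of thumb in action.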

Statements

If an expression is a term, a statement is a phrase. Since it's a phrase, not a sentence, you don't put a period at the end of it, you put a semi-colon ; at the end of it.

For example: int somenumber = 2 + 2 ;

Source Code and Fun With Punctuation

Most punctuation is a single character, so there's nothing complicated as to what separates different punctuation tokens. There are some two-character combinations and there are some matched sets. The matched sets are usually used for organizing things, to start and end sections. Usually a left parenthesis "(" has to have a right parenthesis ")" somewhere, same for { and }, and [ and ]. Quotes like " and ' and ` usually have to be in a matched set, like "foo", or 'bar', or `baz`. You will rarely see the < and > used like this, because they're usually used for greater-than and less-than, although newer versions of Java do use them as a matched set around type names, as in List<String>.
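Here is a small sketch that uses several matched sets at once, so you can see them opening and closing. The names PunctuationDemo, numbers, letter, and word are made up for the example.

```java
// PunctuationDemo, numbers, letter, and word are made-up names for this sketch.
class PunctuationDemo {                        // { opens the class
    public static void main(String[] args) {  // ( ) and [ ] here, then { opens the method
        int[] numbers = { 2, 3 };              // { } around a list of values
        char letter = 'a';                     // ' ' around a single character
        String word = "foo";                   // " " around a String

        // [ ] picks one item out of numbers; counting starts at 0.
        System.out.println(word + letter + numbers[0]);  // prints: fooa2
    }                                          // } closes the method
}                                              // } closes the class
```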

Note: The ` is usually on the key to the left of the number 1 key, upper-left corner of your keyboard, while ' is usually next to the enter key, middle-right side of your keyboard. Some people like to use ` and ' as opposite sets of a matched pair, like ( and ), but very few programming languages do this.

Multitasking

When you get right down to it, a single processor can only do one thing at a time. It can just do things so fast it looks like it's doing several things at once. This is called "multitasking." Juggling is a popular metaphor for this, but I prefer the chess master metaphor.

The computer acts sort of like a brilliant chess player who can play against twenty ordinary people at once - make a move on this chess board, move to the next chess board and make a move, move to the next chess board, etc. It may look like the chess player is doing twenty things at once, but really he's just walking up to a chess board, studying the context - the layout of the chess pieces - in a glance and deciding on the right move.
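The chess master's routine can be sketched in a few lines. This is only a toy version of the metaphor, not how a real operating system juggles programs, and the names (ChessMasterDemo, boards, movesPerBoard) are made up.

```java
// A toy version of the chess master metaphor; all names are made up.
class ChessMasterDemo {
    public static void main(String[] args) {
        int boards = 3;         // how many games are going at once
        int movesPerBoard = 2;  // how many times the master visits each board

        // One pass down the row of boards per round: a single move at each
        // board, then on to the next. Only one thing happens at a time.
        for (int move = 1; move <= movesPerBoard; move++) {
            for (int board = 1; board <= boards; board++) {
                System.out.println("board " + board + ": move " + move);
            }
        }
    }
}
```

Every board gets its first move before any board gets its second, so all three games inch forward together even though the master only ever does one thing at a time.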

Source Code: Compiled and Interpreted Programs

The human-readable version of the program is called the source code. Why this is, you don't really have to know, nor do you really have to know what compiled or interpreted mean. But if you're interested, read on.

The computer itself only understands numbers. Everything that you see a computer do that looks like it deals with words, is in the end boiled down to numbers. The program is too - the human-readable version is purely for your convenience. And believe me, it is one heck of a convenience.

Something has to convert that human-readable stuff to machine code (pure 1s and 0s). This is called a compiler, or sometimes an interpreter. A compiler converts just once. The result is a set of 1s and 0s that the machine runs. An interpreter converts on the fly; it's a program that runs your program, interpreting your words and issuing the right 1s-and-0s version of the commands as necessary.

But in real life, nothing is that simple.

First of all, there's a lot of give and take in terms of whether a particular program is a compiler, per se, or an interpreter. Many "interpreters" actually compile the program right when you run the program. Some compilers (like the Java compiler) compile the program halfway, to a generic format called bytecode, and then it's run on a virtual machine (like the Java Virtual Machine, JVM for short), which is basically another take on the interpreter, a program that runs a program.

Second of all, there is seldom a one-for-one word-to-code translation. Most often, a short set of words converts into dozens or more machine codes. On top of that, many compilers and interpreters then rearrange commands to make it all run faster, while still doing exactly what you said to do.

RTFM, "Use The Source, Luke", and "ask smart questions"

As you learn and ask people for help, you'll hear two phrases very often, RTFM (Read The Fucking Manual) and "Use the Source, Luke" (sometimes abbreviated UTSL). A growing trend is the third phrase, "learn how to ask smart questions", after an essay written by Eric S. Raymond about how to do just that.

All of these boil down to "do your own homework before you ask me for help." Like reading this page. All of us have been there before, and it sucks. We'll spend tons of energy to help you figure out how to do it yourself, but we're not going to do it for you.

Foo, Bar, Baz: Metasyntactic Variables and Stupid Programmer Jokes

Programmers use the words foo, bar, and baz a lot in examples. This is so common that there's actually now a jargon phrase for them: metasyntactic variables.

The reason these show up so often is that often you need to explain something, and you need to explain it with an example, and the example has to refer to some things, purely for the purpose of the example. So programmers use "foo" and "bar" (from the military slang FUBAR, for "Fucked Up Beyond All Recognition"). The third variable, "baz", is just a corruption of "bar", because you often need a third example variable.

Why not use real words? Three reasons:


Last modified: Fri Feb 27 03:22:19 EST 2004