Chapter 9 - The Tower of Babel

(9.1) - Why Babel?

C++ is a powerful general purpose language that has been used to write a large number of programs to perform a great many different tasks. But C++ isn't the only high level programming language. Why aren't all programs written in C++? Are there some things that C++ can't do? If so, then why aren't all programs written in some other language that can do everything?

C++, like a number of other languages, is powerful enough that one can use it to code any task. But in a lot of cases, it's more convenient to code a task in a language that's specifically designed for that task. The situation is analogous to the automobile world: why doesn't everybody drive the same kind of automobile? Because off-road driving is best handled by a four-wheel-drive vehicle, hauling a team of Little Leaguers is best handled in a minivan, etc.

(9.2) - Procedural Languages

A procedural language or imperative language has statements manipulating data items in the computer's memory. The programmer directs, via program instructions, every change in the values stored in the memory locations containing data.

(9.2.1) - FORTRAN

FORTRAN (derived from FORmula TRANslation) was developed in the fifties for applications with a heavy mathematics or computational flavor. Besides the usual operators for addition, subtraction, multiplication, and division (+, -, *, and /), FORTRAN also has an exponentiation operator (**).

In the first version of FORTRAN, statements could be assigned numbers (instead of labels) and changes in control flow were performed by the IF and GO TO statements.

    IF(NUMBER) 20, 30, 40
would send control to statement 20 if NUMBER < 0, to statement 30 if NUMBER = 0, or to statement 40 if NUMBER > 0.
      GO TO 50
would always send control to statement 50. A while-loop in C++ like:
    while(Number >= 0)
    {
        .
        .
        .
        cin >> Number ;
    }
was coded in FORTRAN like:
10    IF(NUMBER) 20, 5, 5
5     .
      .
      .
      READ(*,*) NUMBER
      GO TO 10
20    ...
Statement numbers could be in any order so following the flow of control could be a nightmare: "spaghetti code" was a common term applied to many FORTRAN programs.
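
To see why this style was hard to follow, the same loop can be sketched in C++ with goto statements standing in for the FORTRAN statement numbers. This is only an illustration: the function names, the idea of summing the values, and reading from a vector instead of the keyboard are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Structured version: add up values until a negative one is seen.
int sumWithWhile(const std::vector<int>& input)
{
    int sum = 0;
    std::size_t next = 0;
    while (next < input.size() && input[next] >= 0)
    {
        sum += input[next];
        ++next;
    }
    return sum;
}

// FORTRAN I style: the labels play the role of statement numbers
// 10, 5 and 20 in the FORTRAN fragment above.
int sumWithGoto(const std::vector<int>& input)
{
    int sum = 0;
    std::size_t next = 0;
stmt10:
    if (next >= input.size()) goto stmt20;   // out of data
    if (input[next] < 0) goto stmt20;        // IF(NUMBER) 20, 5, 5
    goto stmt5;
stmt5:
    sum += input[next];
    ++next;
    goto stmt10;                             // GO TO 10
stmt20:
    return sum;
}
```

Both functions compute the same result, but in the second one the reader has to trace every jump to discover that the code is a loop at all.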

Later versions of FORTRAN (FORTRAN II, FORTRAN IV, FORTRAN 77, Fortran 90, and High Performance Fortran) have incorporated more data types and new statements to direct the flow of control.

(9.2.2) - COBOL

COBOL (derived from COmmon Business-Oriented Language) was developed in 1959-1960 by a group headed by Grace Hopper (1906-1992) of the U.S. Navy. It was designed to serve business needs such as updating master files with changes from transaction files and producing summary reports. Instead of mathematical notation like:
    sum = a + b;
a COBOL programmer would write:
    ADD A TO B GIVING SUM.
Back in the fifties and sixties, 80-column punched cards were used for many databases. To conserve card columns, COBOL used only six digits for a date (two for the month, two for the day of the month, and two for the year): some now blame the Y2K problem on COBOL.
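
The arithmetic behind the Y2K problem can be sketched in C++. The two functions below are hypothetical and only assume that a program subtracts one stored year from another, which is a common way to compute an age or the length of a loan.

```cpp
// Years elapsed when both years are stored as two digits (00-99),
// the way a six-digit date stores them.
int yearsElapsedTwoDigit(int fromYY, int toYY)
{
    return toYY - fromYY;
}

// The same calculation with four-digit years.
int yearsElapsedFourDigit(int fromYYYY, int toYYYY)
{
    return toYYYY - fromYYYY;
}
```

From 1965 to 1999 both versions give 34 years. From 1965 to 2000 the four-digit version gives 35 years but the two-digit version computes 0 - 65 and gives -65.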

(9.2.3) - C

Dennis Ritchie at AT&T Bell Labs developed C in the early seventies so that the UNIX operating system and other system programs could be written in a high-order language. Since then C has become a very popular language for general-purpose programming - mainly because UNIX became a very popular operating system.

Like other high-order languages, C doesn't require a program to specify the locations of its variables in memory. Some system programs do need to work with specific locations, however, so C also has some low-level constructs.

For example, if Number is a variable in a C program then &Number refers to the memory address of Number. If Number is stored in memory location 1000 and has a value of 234 then Number is 234 and &Number is 1000.

    Identifier    Address    Value
    Number        1000       234

Another low-level construct in C is the pointer data type for variables that contain memory addresses. For example,

    int* intPointer ;
declares intPointer to be a variable that contains the memory address of an integer and the assignment statement:
    intPointer = (int*) 800 ;
sets intPointer to point to the memory address 800 as shown in Fig. 9.3(a) - intPointer itself is assigned some memory location by the C compiler.

The memory location that intPointer points to is identified by *intPointer so the assignment statement:

    *intPointer = 3 ;
sets the value stored in memory location 800 to 3 as shown in Fig. 9.3(b).

These low-level constructs were put in C so system programs like UNIX can read and write the specific memory locations used by I/O devices. They are also a source of bugs in non-system programs - for example, location 800 in the C code above may already have some other purpose in this program or in some other program (like the operating system).
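
A safer way to experiment with these constructs is to point at a variable the program owns instead of at a fixed address like 800. The sketch below is an assumption-laden illustration (the function name is made up for the example); it compiles as either C or C++.

```cpp
// Returns the value of Number after storing 3 through a pointer to it.
int storeThroughPointer()
{
    int Number = 234;
    int* intPointer = &Number;  // intPointer holds the address of Number
    *intPointer = 3;            // changes the contents of that address
    return Number;              // Number itself is now 3
}
```

Because intPointer holds the address of Number, the assignment through *intPointer changes Number even though the name Number never appears on the left of an assignment after its declaration.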

The C++ language we studied in Chapter 8 is a superset of C that was also developed at AT&T Labs - a C++ programmer doesn't need to use the pointer data type and be concerned with all the bugs it might create.

(9.2.4) - Ada

The U.S. Department of Defense has long had a serious problem maintaining all the software that comes with the systems they buy since it was written in such a wide variety of different languages. In the seventies Ada was developed to alleviate this problem: all software embedded in the hardware delivered to the armed services must now be written in Ada.

(9.2.5) - Java

Java is an object-oriented language like C++ that was developed, along with a number of interesting ideas like GUIs, at Sun Microsystems, Inc. in 1991-92. The main feature of Java is portability - one can write a program in Java and easily port it to a wide variety of different platforms. There are two kinds of Java programs: applications, which run on their own, and applets, which run inside a Web browser.

Portability is achieved by compiling the source program into a low-level language called Java bytecode which is easily translated into the machine language of any computer.

(9.2.6) - C# and .NET

C# (C-sharp) was introduced in June 2000 by Microsoft as a safer extension of C than C++. As an example, a C++ program must explicitly release any dynamic storage it has allocated once it is done with it, whereas C# reclaims such storage automatically through garbage collection.
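
The burden on the C++ programmer can be sketched as follows. The Buffer type and the liveBuffers counter are hypothetical names introduced only to make the release of storage visible. The second function uses the standard std::unique_ptr, which releases storage when the pointer goes out of scope; this is not garbage collection as in C#, only one way a C++ program can avoid a forgotten delete.

```cpp
#include <memory>

// Counts how many Buffer objects currently exist.
int liveBuffers = 0;

struct Buffer
{
    Buffer()  { ++liveBuffers; }
    ~Buffer() { --liveBuffers; }
};

// Manual style: the programmer must remember the delete.
void useManually()
{
    Buffer* b = new Buffer;
    // ... work with *b ...
    delete b;              // omit this line and the storage leaks
}

// Smart-pointer style: storage is released when b goes out of scope.
void useAutomatically()
{
    std::unique_ptr<Buffer> b(new Buffer);
    // ... work with *b ...
}
```

After either function returns, liveBuffers is back to 0. If the delete in useManually were left out, liveBuffers would grow by one on every call.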

The Microsoft .NET Framework is a very large collection of tools for software development in C# and other programming languages.

(9.3) - Special-Purpose Languages

Besides the general-purpose languages mentioned earlier in this chapter there are also a large number of high-level languages for specific purposes. Three of these special-purpose languages are described here.

(9.3.1) - SQL

SQL (standing for Structured Query Language) allows users to pose questions to a database. As an example:
    SELECT NAME
    FROM VENDOR
    WHERE ZIP = 95082;
is an SQL statement asking a database for the names of all vendors in the 95082 Zip-code area.
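
For comparison, here is roughly what a C++ programmer would have to write to answer the same question by hand. The Vendor structure and the function name are assumptions made for this sketch; a real database would not be held in a vector.

```cpp
#include <string>
#include <vector>

// One row of a hypothetical VENDOR table.
struct Vendor
{
    std::string name;
    int zip;
};

// Procedural equivalent of: SELECT NAME FROM VENDOR WHERE ZIP = zip;
std::vector<std::string> namesInZip(const std::vector<Vendor>& vendors, int zip)
{
    std::vector<std::string> names;
    for (const Vendor& v : vendors)
    {
        if (v.zip == zip)
        {
            names.push_back(v.name);
        }
    }
    return names;
}
```

The SQL statement says only what is wanted; the C++ code must also say how to find it, one row at a time.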

(9.3.2) - HTML

HTML (standing for HyperText Markup Language) is the language used to create HTML documents which can be viewed with a Web browser. An HTML document contains the text to be displayed plus markup tags to create special effects and references to other Web pages. These course notes are written in HTML: to get an idea of what HTML looks like, you can use a menu item in your browser to display the source language of this page.

(9.3.3) - JavaScript

JavaScript is a language that is interpreted by a Web browser to make Web pages active rather than static. For example, JavaScript can be used to make a Web page into a form in which a user can enter data to be returned to the Web server.

(9.4) - Alternative Programming Paradigms

A paradigm is a framework or model for thinking about something. The paradigm for a procedural language (whether it is object-oriented or not) is to think of the instructions of a program as accessing and modifying the contents of memory locations. Some other paradigms are described in this section.

(9.4.1) - Functional Programming

In mathematics, a function receives the values of one or more arguments and combines them in some way to produce a single value - the value produced depends only on the values of the arguments and nothing else. A mathematical function transforms the values of its arguments into a corresponding resulting value.

Any computer program transforms the values of its input data into the values of its outputs. If it produces more than one output, one can group them into a single list containing all the output values. Thus, one can think of any computer program as a mathematical function transforming the values of its arguments (input data) into a corresponding resulting value (a list of all output values).

In a functional programming language (like Scheme) certain primitive functions are defined as part of the language and the programmer can use these primitives to build other functions.

For example, the tripling function, f(x) = 3x, transforms its argument into a result with three times the value so f(4) = 12 and f(12) = 36. In Scheme one can define this function and give it the name triple with:

    (define (triple x)
        (* 3 x))
One can then use the triple function by entering something like (triple 4) and Scheme will display the answer, 12.

As another example, the squaring function, g(x) = x^2, can be defined in Scheme and given the name square with:

    (define (square x)
        (* x x))
Once a function is defined one can use it to define other functions. For example, h(x) = 3x^2 can be defined in Scheme and given the name foo with:
    (define (foo x)
        (triple (square x)))
If a user enters (foo 4) then Scheme displays 48.
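
The same three definitions can be written in C++ for comparison. This is a minimal sketch that assumes integer arguments; the names are taken from the Scheme definitions above.

```cpp
// f(x) = 3x
int triple(int x) { return 3 * x; }

// g(x) = x^2
int square(int x) { return x * x; }

// h(x) = 3x^2, built from the two functions above
int foo(int x) { return triple(square(x)); }
```

As in Scheme, foo(4) first squares 4 to get 16 and then triples that to get 48.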

Scheme has four primitives that manipulate lists. Three of them appear in the next example: null? tests whether a list is empty, car returns the first element of a list, and cdr returns the list of all the remaining elements.

As an example, these primitives are used to define a function called adder that takes a list of numbers as its argument and produces the sum of all numbers in the list. It uses the fact that: to sum up all the numbers in a list one can add the first number in the list to the sum of all the other numbers in the list.
    (define (adder input-list)
        (cond ((null? input-list) 0)
           (else (+ (car input-list) 
               (adder (cdr input-list))))))
The definition of adder uses cond, a conditional construct in Scheme: if input-list is empty the result is 0, otherwise the result is the first number in the list plus whatever adder produces for the rest of the list. Note that when input-list is nonempty adder invokes itself with (cdr input-list) as the argument. We call such a function a recursive function. Many problems have simple recursive solutions.

Most procedural languages also support recursion but it's better to write a recursive algorithm in a functional language because the possibility of undue side effects is eliminated. A recursive algorithm written in a procedural language stores intermediate values in memory and a side effect occurs if the code inadvertently changes some value it shouldn't be changing.
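
For comparison, the same recursion can be sketched in C++. The version below is an illustration under stated assumptions: the list is a std::vector<int>, and a hypothetical extra parameter, first, stands in for repeatedly taking the cdr of the list.

```cpp
#include <cstddef>
#include <vector>

// Sum of the elements of inputList from position first onward.
int adder(const std::vector<int>& inputList, std::size_t first = 0)
{
    if (first >= inputList.size())      // (null? input-list)
    {
        return 0;
    }
    return inputList[first]             // (car input-list)
         + adder(inputList, first + 1); // (adder (cdr input-list))
}
```

This version changes no stored values, so it has no side effects either; nothing in C++ enforces that, however, which is the point made above about procedural languages.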

A program written in a procedural language specifies the exact order in which its steps are performed: a program written in a functional language doesn't specify this order. The only restriction on the order of function evaluations is that a function can't be evaluated until all its arguments have been evaluated.

(9.4.2) - Logic Programming

In logic programming, various facts are asserted to be true and a logic program can infer or deduce other facts in response to queries. The best-known logic programming language is Prolog (standing for PROgramming in LOGic). As an example, we show a Prolog program whose domain of interest is American history. Each of the following facts shows who was the U.S. president when a certain event occurred:
    president(lincoln, gettysburg_address) .
    president(lincoln, civil_war) .
    president(nixon, first_moon_landing) .
    president(jefferson, lewis_and_clark) .
    president(kennedy, cuban_missile_crisis) .
    president(fdr, world_war_II) .
The following facts show the chronology of the terms of office of these presidents:
    before(jefferson, lincoln) .
    before(lincoln, fdr) .
    before(fdr, kennedy) .
    before(kennedy, nixon) .
If given the query:
?-before(lincoln, fdr) .
then Prolog responds with a Yes because that is a fact in the program. The following query:
?-president(lincoln,civil_war),before(lincoln,fdr).
asks if Lincoln was president during the civil war AND if Lincoln was before FDR. Since both facts are in the program, Prolog responds with a Yes. One can also use variables (beginning with capital letters) in a query. For example, the query:
    ?-president(lincoln,X).
is answered with the following responses:
    X = gettysburg_address
    X = civil_war
There is a problem with the before relation. For example, Lincoln was president before Kennedy but the query: ?-before(lincoln,kennedy) will get a No response. To correct this problem one can define another relation, precedes:
precedes(X,Y) :- before(X,Y) .
precedes(X,Y) :- before(X,Z),precedes(Z,Y) .
The first line of this definition says that X precedes Y if X comes before Y. The second line says that X precedes Y if X comes before some third president, Z, who precedes Y. With this definition, the query: ?-precedes(lincoln,kennedy) will now get a Yes response.

One can now define another relation to show the time order of events, earlier:

earlier(X,Y) :- president(R,X),president(S,Y),precedes(R,S) .
The query: ?-earlier(world_war_II,X) will now get the following responses:
    X = first_moon_landing
    X = cuban_missile_crisis
Prolog is a declarative language rather than an imperative language because a Prolog program has no commands to perform any operations. As shown in Fig. 9.11, there is an inference engine inside the Prolog interpreter or compiler which reads each query and determines how best to answer it.
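
To give a feel for the work the inference engine does on the programmer's behalf, the sketch below answers the precedes and earlier queries in C++. It is only an illustration: storing the facts in vectors and the names beforeFacts, presidentFacts and laterEvents are assumptions made for the example, and a real Prolog engine is far more general than this hand-written search.

```cpp
#include <string>
#include <utility>
#include <vector>

using Fact = std::pair<std::string, std::string>;

// before(X, Y) facts from the Prolog program.
const std::vector<Fact> beforeFacts = {
    {"jefferson", "lincoln"}, {"lincoln", "fdr"},
    {"fdr", "kennedy"}, {"kennedy", "nixon"}};

// president(P, Event) facts from the Prolog program.
const std::vector<Fact> presidentFacts = {
    {"lincoln", "gettysburg_address"}, {"lincoln", "civil_war"},
    {"nixon", "first_moon_landing"}, {"jefferson", "lewis_and_clark"},
    {"kennedy", "cuban_missile_crisis"}, {"fdr", "world_war_II"}};

// precedes(X,Y) :- before(X,Y).
// precedes(X,Y) :- before(X,Z), precedes(Z,Y).
// Terminates because the before facts contain no cycles.
bool precedes(const std::string& x, const std::string& y)
{
    for (const Fact& f : beforeFacts)
    {
        if (f.first != x) continue;
        if (f.second == y || precedes(f.second, y)) return true;
    }
    return false;
}

// All events Y for which earlier(x, Y) holds.
std::vector<std::string> laterEvents(const std::string& x)
{
    std::vector<std::string> result;
    for (const Fact& r : presidentFacts)
    {
        if (r.second != x) continue;          // president(R, X)
        for (const Fact& s : presidentFacts)  // president(S, Y)
        {
            if (precedes(r.first, s.first)) result.push_back(s.second);
        }
    }
    return result;
}
```

Every loop and test here had to be spelled out by the programmer. In Prolog the three rules are all that is written and the inference engine supplies the search.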

(9.4.3) - Parallel Programming

A number of important problems are grand challenges that can't be solved in any reasonable time by a single processor: a parallel processor with a large number of processing elements working together on such a problem should be able to solve it much faster. The parallel processors that have been built come in two different flavors: SIMD (pronounced sim-dee), which stands for Single Instruction stream, Multiple Data streams, and MIMD (pronounced mim-dee), which stands for Multiple Instruction streams, Multiple Data streams.

Note that every algorithm we studied in this course performed its steps sequentially (one at a time). To use a parallel processor efficiently one needs a parallel algorithm whose steps can be run in parallel. Some algorithms like the Sequential Search algorithm in subsection 2.3.2 are easily parallelized.
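
One way a search might be parallelized is sketched below in C++. This is not the algorithm of subsection 2.3.2 itself, only an illustration that assumes the search looks for a target value in a list of integers. The function names and the default of four pieces are made up for the example. It uses the standard std::async facility, which on some compilers needs a threading option such as -pthread.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Searches list[begin, end) sequentially; returns the index of the first
// match or -1 if the target is not in that part of the list.
long searchPart(const std::vector<int>& list, std::size_t begin,
                std::size_t end, int target)
{
    for (std::size_t i = begin; i < end; ++i)
    {
        if (list[i] == target) return static_cast<long>(i);
    }
    return -1;
}

// Splits the list into `parts` roughly equal pieces and searches each
// piece in its own task. Returns the lowest matching index, or -1.
long parallelSearch(const std::vector<int>& list, int target,
                    std::size_t parts = 4)
{
    if (parts == 0) parts = 1;
    const std::size_t chunk = (list.size() + parts - 1) / parts;
    std::vector<std::future<long>> tasks;
    for (std::size_t p = 0; p < parts; ++p)
    {
        const std::size_t begin = std::min(p * chunk, list.size());
        const std::size_t end = std::min(begin + chunk, list.size());
        tasks.push_back(std::async(std::launch::async, searchPart,
                                   std::cref(list), begin, end, target));
    }
    long found = -1;
    for (std::future<long>& t : tasks)
    {
        const long index = t.get();           // wait for this piece
        if (index >= 0 && found < 0) found = index;
    }
    return found;
}
```

Each piece is searched independently of the others, which is what makes the search easy to parallelize: no piece needs a result from any other piece until the very end.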

Kenneth E. Batcher - 10/31/2006