Computer Science

Computer Science

by Geneva G. Belford

Computer science, the study of computers, including their design (architecture) and their uses for computations, data processing, and systems control. The field of computer science includes engineering activities such as the design of computers and of the hardware and software that make up computer systems. It also encompasses theoretical, mathematical activities, such as the design and analysis of algorithms, performance studies of systems and their components by means of techniques like queueing theory, and the estimation of the reliability and availability of systems by probabilistic techniques. Since computer systems are often too large and complicated to allow a designer to predict failure or success without testing, experimentation is incorporated into the development cycle. Computer science is generally considered a discipline separate from computer engineering, although the two disciplines overlap extensively in the area of computer architecture, which is the design and study of computer systems.
The major subdisciplines of computer science have traditionally been (1) architecture (including all levels of hardware design, as well as the integration of hardware and software components to form computer systems), (2) software (the programs, or sets of instructions, that tell a computer how to carry out tasks), here subdivided into software engineering, programming languages, operating systems, information systems and databases, artificial intelligence, and computer graphics, and (3) theory, which includes computational methods and numerical analysis on the one hand and data structures and algorithms on the other.
Development of computer science
Computer science as an independent discipline dates to only about 1960, although the electronic digital computer that is the object of its study was invented some two decades earlier. The roots of computer science lie primarily in the related fields of electrical engineering and mathematics. Electrical engineering provides the basics of circuit design—namely, the idea that electrical impulses input to a circuit can be combined to produce arbitrary outputs. The invention of the transistor and the miniaturization of circuits, along with the invention of electronic, magnetic, and optical media for the storage of information, resulted from advances in electrical engineering and physics. Mathematics is the source of one of the key concepts in the development of the computer—the idea that all information can be represented as sequences of zeros and ones. In the binary number system, numbers are represented by a sequence of the binary digits 0 and 1 in the same way that numbers in the familiar decimal system are represented using the digits 0 through 9. The relative ease with which two states (e.g., high and low voltage) can be realized in electrical and electronic devices led naturally to the binary digit, or bit, becoming the basic unit of data storage and transmission in a computer system.
The Boolean algebra developed in the 19th century supplied a formalism for designing a circuit with binary input values of 0s and 1s (false or true, respectively, in the terminology of logic) to yield any desired combination of 0s and 1s as output. Theoretical work on computability, which began in the 1930s, provided the needed extension to the design of whole machines; a milestone was the 1936 specification of the conceptual Turing machine (a theoretical device that manipulates an infinite string of 0s and 1s) by the British mathematician Alan Turing and his proof of the model’s computational power. Another breakthrough was the concept of the stored-program computer, usually credited to the Hungarian-American mathematician John von Neumann. This idea—that instructions as well as data should be stored in the computer’s memory for fast access and execution—was critical to the development of the modern computer. Previous thinking was limited to the calculator approach, in which instructions are entered one at a time.
The needs of users and their applications provided the main driving force in the early days of computer science, as they still do to a great extent today. The difficulty of writing programs in the machine language of 0s and 1s led first to the development of assembly language, which allows programmers to use mnemonics for instructions (e.g., ADD) and symbols for variables (e.g., X). Such programs are then translated by a program known as an assembler into the binary encoding used by the computer. Other pieces of system software known as linking loaders combine pieces of assembled code and load them into the machine’s main memory unit, where they are then ready for execution. The concept of linking separate pieces of code was important, since it allowed “libraries” of programs to be built up to carry out common tasks—a first step toward the increasingly emphasized notion of software reuse.
Assembly language was found to be sufficiently inconvenient that higher-level languages (closer to natural languages) were invented in the 1950s for easier, faster programming; along with them came the need for compilers, programs that translate high-level language programs into machine code. As programming languages became more powerful and abstract, building efficient compilers that create high-quality code in terms of execution speed and storage consumption became an interesting computer science problem in itself.
Increasing use of computers in the early 1960s provided the impetus for the development of operating systems, which consist of system-resident software that automatically handles input and output and the execution of jobs. The historical development of operating systems is summarized below under that topic. Throughout the history of computers, the machines have been utilized in two major applications: (1) computational support of scientific and engineering disciplines and (2) data processing for business needs. The demand for better computational techniques led to a resurgence of interest in numerical methods and their analysis, an area of mathematics that can be traced to the methods devised several centuries ago by physicists for the hand computations they made to validate their theories.
Improved methods of computation had the obvious potential to revolutionize how business is conducted, and in pursuit of these business applications new information systems were developed in the 1950s that consisted of files of records stored on magnetic tape. The invention of magnetic-disk storage, which allows rapid access to an arbitrary record on the disk, led not only to more cleverly designed file systems but also, in the 1960s and ’70s, to the concept of the database and the development of the sophisticated database management systems now commonly in use. Data structures, and the development of optimal algorithms for inserting, deleting, and locating data, have constituted major areas of theoretical computer science since its beginnings because of the heavy use of such structures by virtually all computer software—notably compilers, operating systems, and file systems. Another goal of computer science is the creation of machines capable of carrying out tasks that are typically thought of as requiring human intelligence. Artificial intelligence, as this goal is known, actually predates the first electronic computers in the 1940s, although the term was not coined until 1956.
Computer graphics was introduced in the early 1950s with the display of data or crude images on paper plots and cathode-ray tube (CRT) screens. Expensive hardware and the limited availability of software kept the field from growing until the early 1980s, when the computer memory required for bit-map graphics became affordable. (A bit map is a binary representation in main memory of the rectangular array of points [pixels, or picture elements] on the screen. Because the first bit-map displays used one binary bit per pixel, they were capable of displaying only one of two colours, commonly black and green or black and amber. Later computers, with more memory, assigned more binary bits per pixel to obtain more colours.) Bit-map technology, together with high-resolution display screens and the development of graphics standards that make software less machine-dependent, has led to the explosive growth of the field. Software engineering arose as a distinct area of study in the late 1970s as part of an attempt to introduce discipline and structure into the software design and development process. For a thorough discussion of the development of computing, see computers, history of.
Architecture deals with both the design of computer components (hardware) and the creation of operating systems (software) to control the computer. Although designing and building computers is often considered the province of computer engineering, in practice there exists considerable overlap with computer science.
Basic computer components
A digital computer (see also analog computer) typically consists of a control unit, an arithmetic-logic unit, a memory unit, and input/output units, as illustrated in the figure. The arithmetic-logic unit (ALU) performs simple addition, subtraction, multiplication, division, and logic operations—such as OR and AND. The main computer memory, usually high-speed random-access memory (RAM), stores instructions and data. The control unit fetches data and instructions from memory and effects the operations of the ALU. The control unit and ALU usually are referred to as a processor, or central processing unit (CPU). The operational speed of the CPU primarily determines the speed of the computer as a whole. The basic operation of the CPU is analogous to a computation carried out by a person using an arithmetic calculator, as illustrated in the figure. The control unit corresponds to the human brain and the memory to a notebook that stores the program, initial data, and intermediate and final computational results. In the case of an electronic computer, the CPU and fast memories are realized with transistor circuits.
I/O units, or devices, are commonly referred to as computer peripherals and consist of input units (such as keyboards and optical scanners) for feeding instructions and data into the computer and output units (such as printers and monitors) for displaying results.
In addition to RAM, a computer usually contains some slower, but larger and permanent, secondary memory storage. Almost all computers contain a magnetic storage device known as a hard disk, as well as a disk drive to read from or write to removable magnetic media known as floppy disks. Various optical and magnetic-optical hybrid removable storage media are also quite common, such as CD-ROMs (compact disc read-only memory) and DVD-ROMs (digital video [or versatile] disc read-only memory).
Computers also often contain a cache—a small, extremely fast (compared to RAM) memory unit that can be used to store information that will be urgently or frequently needed. Current research includes cache design and algorithms that can predict what data is likely to be needed next and preload it into the cache for improved performance.
Basic computer operation
The operation of such a computer, once a program and some data have been loaded into RAM, is as follows. The first instruction is transferred from RAM into the control unit and interpreted by the hardware circuitry. For instance, suppose that the instruction is a string of bits that is the code for LOAD 10. This instruction loads the contents of memory location 10 into the ALU. The next instruction, say ADD 15, is fetched. The control unit then loads the contents of memory location 15 into the ALU and adds it to the number already there. Finally, the instruction STORE 20 would store the sum in location 20. At this level the operation of a computer is not much different from that of a pocket calculator. In general, of course, programs are not just lengthy sequences of LOAD, STORE, and arithmetic operations. Most importantly, computer languages include conditional instructions, essentially rules that say, “If memory location n satisfies condition a, do instruction number x next, otherwise do instruction y.” This allows the course of a program to be determined by the results of previous operations—a critically important ability.
Logic design and integrated circuits
Logic design is the area of computer science that deals with the design of electronic circuits to carry out the operations of the control unit, the ALU, the I/O controllers, and more. For example, the addition circuit of the ALU has inputs corresponding to all the bits of the two numbers to be added and outputs corresponding to the bits of the sum. The arrangement of wires and transistors that link inputs to outputs is determined by logic-design principles. The design of the control unit provides the circuits that interpret instructions and control subsequent behaviour. Clearly, it is critical that this circuitry be as efficient as possible; logic design deals with optimizing the circuitry, not just putting together something that will work. Boolean algebra is the mathematical tool used for logic design.
An important area related to architecture is the design of computer chips, or microprocessors, a type of integrated circuit. A microprocessor is a complete CPU—control unit, ALU, and possibly some memory (especially cache)—on a single integrated circuit chip. Additional memory and I/O control circuitry are linked to this chip to form a complete computer. These thumbnail-sized devices contain thousands or millions of transistors, together with wiring, to form the processing and memory units of modern computers.
The process of very-large-scale integrated (VLSI) circuit design involves a number of stages, which characteristically are as follows: (1) creating the initial functional or behavioral specification, (2) encoding this specification into a hardware description language, (3) breaking down the design into modules and generating sizes and shapes for the eventual chip components, and (4) chip planning, which includes building a “floor plan” to indicate where on the chip the components are to be placed and how they are to be interconnected. The modularization, sizing, and planning stages are often iterated before a final design is reached. The final stage is the formulation of the instructions for the automated production of the chip through an optical lithography process. Computer scientists are involved not only in creating the computer-aided design (CAD) tools to support engineers in the various stages of chip design but also in providing the necessary theoretical results, such as how to efficiently design a floor plan with near-minimal area that satisfies the given constraints.
Advances in integrated-circuit technology have been incredible. For example, in 1971 the first microprocessor chip (Intel Corporation’s 4004) had only 2,300 transistors, in 1993 Intel’s Pentium chip had more than 3 million transistors, and by 1997 the number of transistors on such a chip was about 20 million. A new chip design by International Business Machines Corporation (IBM), the Power4, containing approximately 170 million transistors, is scheduled to be introduced in 2001. Meanwhile, memory chips reached a billion transistors per chip before 1999.
As the growth of the personal computer industry in the 1980s and ’90s fueled research into ever more powerful processors at ever lower costs, microprocessors became ubiquitous—controlling automated assembly lines, traffic signal systems, and retail inventory systems, to name a few applications, and being embedded in many consumer products, such as automobile fuel-injection systems, kitchen appliances, audio systems, cell phones, and electronic games. See the section Impact of computer systems.
Linking processors
Multiprocessor Design
Creating a multiprocessor from a number of uniprocessors (one CPU) requires physical links and a mechanism for communication among the processors so that they may operate in parallel. Tightly coupled multiprocessors share memory and hence may communicate by storing information in memory accessible by all processors. Loosely coupled multiprocessors, including computer networks (see the section Network protocols), communicate by sending messages to each other across the physical links. Computer scientists investigate various aspects of such multiprocessor architectures. For example, the possible geometric configurations in which hundreds or even thousands of processors may be linked together are examined to find the geometry that best supports computations. A much studied topology is the hypercube, in which each processor is connected directly to some fixed number of neighbours: two for the two-dimensional square, three for the three-dimensional cube, and similarly for the higher dimensional hypercubes. Computer scientists also investigate methods for carrying out computations on such multiprocessor machines—e.g., algorithms to make optimal use of the architecture, measures to avoid conflicts as data and instructions are transmitted among processors, and so forth. The machine-resident software that makes possible the use of a particular machine, in particular its operating system (see below Operating systems), is in many ways an integral part of its architecture.
Network Protocols
Another important architectural area is the computer communications network, in which computers are linked together via computer cables, infrared light signals, or low-power radiowave transmissions over short distances to form local area networks (LANs) or via telephone lines, television cables, or satellite links to form wide-area networks (WANs). By the 1990s, the Internet, a network of networks, made it feasible for nearly all computers in the world to communicate. Linking computers physically is easy; the challenge for computer scientists has been the development of protocols—standardized rules for the format and exchange of messages—to allow processes running on host computers to interpret the signals they receive and to engage in meaningful “conversations” in order to accomplish tasks on behalf of users. Network protocols also include flow control, which keeps a data sender from swamping a receiver with messages it has no time to process or space to store, and error control, which involves error detection and automatic resending of messages to compensate for errors in transmission. For some of the technical details of error detection and error correction, see the article information theory.
The standardization of protocols has been an international effort for many years. Since it would otherwise be impossible for different kinds of machines running diverse operating systems to communicate with one another, the key concern has been that system components (computers) be “open”—i.e., open for communication with other open components. This terminology comes from the open systems interconnection (OSI) communication standards, established by the International Organization for Standardization. The OSI reference model specifies protocol standards in seven “layers,” as shown in the figure. The layering provides a modularization of the protocols and hence of their implementations. Each layer is defined by the functions it relies upon from the next lower level and by the services it provides to the layer above it. At the lowest level, the physical layer, rules for the transport of bits across a physical link are defined. Next, the data-link layer handles standard-size “packets” of data bits and adds reliability in the form of error detection and flow control. Network and transport layers (often combined in implementations) break up messages into the standard-size packets and route them to their destinations. The session layer supports interactions between application processes on two hosts (machines). For example, it provides a mechanism with which to insert checkpoints (saving the current status of a task) into a long file transfer so that, in case of a failure, only the data after the last checkpoint need to be retransmitted. The presentation layer is concerned with such functions as transformation of data encodings, so that heterogeneous systems may engage in meaningful communication. At the highest, or application, level are protocols that support specific applications. An example of such an application is the transfer of files from one host to another. Another application allows a user working at any kind of terminal or workstation to access any host as if the user were local.
Distributed Computing
The building of networks and the establishment of communication protocols have led to distributed systems, in which computers linked in a network cooperate on tasks. A distributed database system, for example, consists of databases (see the section Information systems and databases) residing on different network sites. Data may be deliberately replicated on several different computers for enhanced availability and reliability, or the linkage of computers on which databases already reside may accidentally cause an enterprise to find itself with distributed data. Software that provides coherent access to such distributed data then forms a distributed database management system.
Client-Server Architecture
The client-server architecture has become important in designing systems that reside on a network. In a client-server system, one or more clients (processes) and one or more servers (also processes, such as database managers or accounting systems) reside on various host sites of a network. Client-server communication is supported by facilities for interprocess communication both within and between hosts. Clients and servers together allow for distributed computation and presentation of results. Clients interact with users, providing an interface to allow the user to request services of the server and to display the results from the server. Clients usually do some interpretation or translation, formulating commands entered by the user into the formats required by the server. Clients may provide system security by verifying the identity and authorization of the users before forwarding their commands. Clients may also check the validity and integrity of user commands; for example, they may restrict bank account transfers to certain maximum amounts. In contrast, servers never initiate communications; instead they wait to respond to requests from clients. Ideally, a server should provide a standardized interface to clients that is transparent, i.e., an interface that does not require clients to be aware of the specifics of the server system (hardware and software) that is providing the service. In today’s environment, in which local area networks are common, the client-server architecture is very attractive. Clients are made available on individual workstations or personal computers, while servers are located elsewhere on the network, usually on more powerful machines. In some discussions the machines on which client and server processes reside are themselves referred to as clients and servers.
A major disadvantage of a pure client-server approach to system design is that clients and servers must be designed together. That is, to work with a particular server application, the client must be using compatible software. One common solution is the three-tier client-server architecture, in which a middle tier, known as middleware, is placed between the server and the clients to handle the translations necessary for different client platforms. Middleware also works in the other direction, allowing clients easy access to an assortment of applications on heterogeneous servers. For example, middleware could allow a company’s sales force to access data from several different databases and to interact with customers who are using different types of computers.
Web servers
The other major approach to client-server communications is via the World Wide Web. Web servers may be accessed over the Internet from almost any hardware platform with client applications known as Web browsers. In this architecture, clients need few capabilities beyond Web browsing (the simplest such clients are known as network machines and are analogous to simple computer terminals). This is because the Web server can hold all of the desired applications and handle all of the requisite computations, with the client’s role limited to supplying input and displaying the server-generated output. This approach to the implementation of, for example, business systems for large enterprises with hundreds or even thousands of clients is likely to become increasingly common in the future.
Reliability is an important issue in systems architecture. Components may be replicated to enhance reliability and increase availability of the system functions. Such applications as aircraft control and manufacturing process control are likely to run on systems with backup processors ready to take over if the main processor fails, often running in parallel so the transition to the backup is smooth. If errors are potentially disastrous, as in aircraft control, results may be collected from replicated processes running in parallel on separate machines and disagreements settled by a voting mechanism. Computer scientists are involved in the analysis of such replicated systems, providing theoretical approaches to estimating the reliability achieved by a given configuration and processor parameters, such as average time between failures and average time required to repair the processor. Reliability is also an issue in distributed systems. For example, one of the touted advantages of a distributed database is that data replicated on different network hosts are more available, so applications that require the data will execute more reliably.
Real-time systems
The design of real-time systems is becoming increasingly important. Computers have been incorporated into cars, aircraft, manufacturing assembly lines, and other applications to control processes as they occur—known as “in real time.” It is not practical in such instances to provide input to the computer, allow it to compute for some indefinite length of time, and then examine the output. The computer output must be available in a timely fashion, and the processor (or processors) must be carefully chosen and the tasks specially scheduled so that deadlines are met. Frequently, real-time tasks repeat at fixed time intervals; for example, every so many seconds, sensor data are gathered and analyzed and a control signal generated. In such cases, scheduling theory is utilized by the systems designer in determining how the tasks should be scheduled on a given processor. A good example of a system that requires real-time action is the antilock braking system (ABS) on most newer vehicles; because it is critical that the ABS instantly react to brake-pedal pressure and begin a program of pumping the brakes, such an application is said to have a hard deadline. Some other real-time systems are said to have soft deadlines, in that, although it is deemed important to meet them, no disaster will happen if the system’s response is slightly delayed; an example is ocean shipping and tracking systems. The concept of “best effort” arises in real-time system design, not only because soft deadlines may sometimes be slipped, but because hard deadlines may sometimes be met by computing a less than optimal result. For example, most details on an air traffic controller’s screen are approximations—e.g., altitude, which need not be displayed to the nearest inch—that do not interfere with air safety.


Software engineering
Computer programs, the software that is becoming an ever-larger part of the computer system, are growing more and more complicated, requiring teams of programmers and years of effort to develop. As a consequence, a new subdiscipline, software engineering, has arisen. The development of a large piece of software is perceived as an engineering task, to be approached with the same care as the construction of a skyscraper, for example, and with the same attention to cost, reliability, and maintainability of the final product. The software-engineering process is usually described as consisting of several phases, variously defined but in general consisting of: (1) identification and analysis of user requirements, (2) development of system specifications (both hardware and software), (3) software design (perhaps at several successively more detailed levels), (4) implementation (actual coding), (5) testing, and (6) maintenance.
Even with such an engineering discipline in place, the software-development process is expensive and time-consuming. Since the early 1980s, increasingly sophisticated tools have been built to aid the software developer and to automate as much as possible the development process. Such computer-aided software engineering (CASE) tools span a wide range of types, from those that carry out the task of routine coding when given an appropriately detailed design in some specification language to those that incorporate an expert system to enforce design rules and eliminate software defects prior to the coding phase.
As the size and complexity of software has grown, the concept of reuse has become increasingly important in software engineering, since it is clear that extensive new software cannot be created cheaply and rapidly without incorporating existing program modules (subroutines, or pieces of computer code). One of the attractive aspects of object-oriented programming (see below Programming languages) is that code written in terms of objects is readily reused. As with other aspects of computer systems, reliability—usually rather vaguely defined as the likelihood of a system to operate correctly over a reasonably long period of time—is a key goal of the finished software product. Sophisticated techniques for testing software have therefore been designed. For example, a large software product might be deliberately “seeded” with artificial faults, or “bugs”; if they are all discovered through testing, there is a high probability that most actual faults likely to cause computational errors have been discovered as well. The need for better trained software engineers has led to the development of educational programs in which software engineering is either a specialization within computer science or a separate program. The recommendation that software engineers, like other engineers, be licensed or certified is gaining increasing support, as is the momentum toward the accreditation of software engineering degree programs.

Programming languages

Early Languages
Programming languages are the languages in which a programmer writes the instructions that the computer will ultimately execute. The earliest programming languages were assembly languages, not far removed from the binary-encoded instructions directly executed by the machine hardware. Users soon (beginning in the mid-1950s) invented more convenient languages.
The early language FORTRAN (Formula Translator) was originally much like assembly language; however, it allowed programmers to write algebraic expressions instead of coded instructions for arithmetic operations. As learning to program computers became increasingly important in the 1960s, a stripped down “basic” version of FORTRAN called BASIC (Beginner’s All-Purpose Symbolic Instruction Code) was written by John G. Kemeny and Thomas E. Kurtz at Dartmouth College, Hanover, New Hampshire, U.S., to teach novices simple programming skills. BASIC quickly spread to other academic institutions, and, beginning about 1980, versions of BASIC for personal computers allowed even students at elementary schools to learn the fundamentals of programming.
At roughly the same time as FORTRAN was created, COBOL (Common Business-Oriented Language) was developed to handle records and files and the operations necessary for simple business applications. The trend since then has been toward developing increasingly abstract languages, allowing the programmer to think and communicate with the machine at a level ever more remote from machine code.
Imperative versus functional languages
COBOL, FORTRAN, and their descendants, such as Pascal and C, are known as imperative languages, since they specify as a sequence of explicit commands how the machine is to go about solving the problem at hand; this is not very different from what takes place at the machine level. Other languages are functional, in the sense that programming is done by calling (i.e., invoking) functions or procedures, which are sections of code executed within a program. The best-known language of this type is LISP (List Processing), in which all computation is expressed as an application of a function to one or more “objects.” Since LISP objects may be other functions as well as individual data items (variables, in mathematical terminology) or data structures (see the section Data structures and algorithms), a programmer can create functions at the appropriate level of abstraction to solve the problem at hand. This feature has made LISP a popular language for artificial intelligence applications, although it has been somewhat superseded by logic programming languages such as Prolog (Programming in Logic). These are termed nonprocedural, or declarative, languages in the sense that the programmer specifies what goals are to be accomplished but not how specific methods are to be applied to attain those goals. Prolog is based on the concepts of resolution (akin to logical deduction) and unification (similar to pattern matching). Programs in such languages are written as a sequence of goals. A recent extension of logic programming is constraint logic programming, in which pattern matching is replaced by the more general operation of constraint satisfaction. Again, programs are a sequence of goals to be attained, in this case the satisfaction of the specified constraints.
Recent Developments
Object-oriented languages
An important trend in programming languages is support for data encapsulation, or object-oriented code. Data encapsulation is best illustrated by the language Smalltalk, in which all programming is done in terms of so-called objects. An object in Smalltalk or similar object-oriented languages consists of data together with the procedures (program segments) to operate on that data. Encapsulation refers to the fact that an object’s data can be accessed only through the methods (procedures) provided. Programming is done by creating objects that send messages to one another so that tasks can be accomplished cooperatively by invoking each others’ methods. This object-oriented paradigm has been very influential. For example, the language C, which was popular for engineering applications and systems development, has largely been supplanted by its object-oriented extension C++. An object-oriented version of BASIC, named Visual BASIC, is available for personal computers and allows even novice programmers to create interactive applications with elegant graphical user interfaces (GUIs).
In 1995 Sun Microsystems, Inc., introduced Java, yet another object-oriented language. Applications written in Java are not translated into a particular machine language but into an intermediate language called Java Bytecode, which may be executed on any computer (such as those using UNIX, Macintosh, or Windows operating systems) with a Java interpretation program known as a Java virtual machine. (See Program translation below.) Thus Java is ideal for creating distributed applications or Web-based applications. The applications can reside on a server in Bytecode form, which is readily downloaded to and executed on any Java virtual machine. In many cases it is not desirable to download an entire application but only an interface through which a client may communicate interactively with the application. Java applets (small chunks of application code) solve this problem. Residing on Web-based servers, they may be downloaded to and run in any standard Web browser to provide, for example, a client interface to a game or database residing on a server.
Concurrency refers to the execution of more than one procedure at the same time (perhaps with the access of shared data), either truly simultaneously (as on a multiprocessor) or in an unpredictably interleaved manner. Languages such as Ada (the U.S. Department of Defense standard applications language from 1983 until 1997) include both encapsulation and features to allow the programmer to specify the rules for interactions between concurrent procedures or tasks.
High-level languages
At a still higher level of abstraction lie visual programming languages, in which programmers graphically express what they want done by means of icons to represent data objects or processes and arrows to represent data flow or sequencing of operations. As of yet, none of these visual programming languages has found wide commercial acceptance. On the other hand, high-level user-interface languages for special-purpose software have been much more successful; for example, languages like Mathematica, in which sophisticated mathematics may be easily expressed, or the “fourth generation” database-querying languages that allow users to express requests for data with simple English-like commands. For example, a query such as “Select salary from payroll where employee = ‘Jones,’ ” written in the database language SQL (Structured Query Language), is easily understood by the reader. The high-level language HTML (HyperText Markup Language) allows nonprogrammers to design Web pages by specifying their structure and content but leaves the detailed presentation and extraction of information to the client’s Web browser.
Program Translation
Computer programs written in any language other than machine language must be either interpreted or compiled. An interpreter is software that examines a computer program one instruction at a time and calls on code to execute the operations required by that instruction. This is a rather slow process. A compiler is software that translates a computer program as a whole into machine code that is saved for subsequent execution whenever desired. Much work has been done on making both the compilation process and the compiled code as efficient as possible. When a new language is developed, it is usually at first interpreted. If the language becomes popular, it becomes important to write compilers for it, although this may be a task of considerable difficulty. There is an intermediate approach, which is to compile code not into machine language but into an intermediate language that is close enough to machine language that it is efficient to interpret—though not so close that it is tied to the machine language of a particular computer. It is use of this approach that provides the Java language with its computer-platform independence.
Operating systems
Development of Operating Systems
In early computers, the user typed programs onto punched tape or cards, from which they were read into the computer. The computer subsequently assembled or compiled the programs and then executed them, and the results were then transmitted to a printer. It soon became evident that much valuable computer time was wasted between users and also while jobs (programs to be executed) were being read or while the results were being printed. The earliest operating systems consisted of software residing in the computer that handled “batches” of user jobs—i.e., sequences of jobs stored on magnetic tape that are read into computer memory and executed one at a time without intervention by user or operator. Accompanying each job in a batch were instructions to the operating system (OS) detailing the resources needed by the job—for example, the amount of CPU time, the files and the storage devices on which they resided, the output device, whether the job consisted of a program that needed to be compiled before execution, and so forth. From these beginnings came the key concept of an operating system as a resource allocator. This role became more important with the rise of multiprogramming, in which several jobs reside in the computer simultaneously and share resources—for example, being allocated fixed amounts of CPU time in turn. More sophisticated hardware allowed one job to be reading data while another wrote to a printer and still another performed computations. The operating system was the software that managed these tasks in such a way that all the jobs were completed without interfering with one another.
Further work was required of the operating system with the advent of interactive computing, in which the user enters commands directly at a terminal and waits for the system to respond. Processes known as terminal handlers were added to the system, along with mechanisms like interrupts (to get the attention of the operating system to handle urgent tasks) and buffers (for temporary storage of data during input/output to make the transfer run more smoothly). A large computer can now interact with hundreds of users simultaneously, giving each the perception of being the sole user. The first personal computers used relatively simple operating systems, such as some variant of DOS (disk operating system), with the main jobs of managing the user’s files, providing access to other software (such as word processors), and supporting keyboard input and screen display. Perhaps the most important trend in operating systems today is that they are becoming increasingly machine-independent. Hence, users of modern, portable operating systems like UNIX, Microsoft Corporation’s Windows NT, and Linux are not compelled to learn a new operating system each time they purchase a new, faster computer (possibly using a completely different processor).
Deadlock and Synchronization
Among the problems that need to be addressed by computer scientists in order for sophisticated operating systems to be built are deadlock and process synchronization. Deadlock occurs when two or more processes (programs in execution) request the same resources and are allocated them in such a way that a circular chain of processes is formed, where each process is waiting for a resource held by the next process in the chain. As a result, no process can continue; they are deadlocked. An operating system can handle this situation with various prevention or detection and recovery techniques. For example, resources might be numbered 1, 2, 3, and so on. If they must be requested by each process in this order, it is impossible for a circular chain of deadlocked processes to develop. Another approach is simply to allow deadlocks to occur, detect them by examining nonactive processes and the resources they are holding, and break any deadlock by aborting one of the processes in the chain and releasing its resources.
Process synchronization is required when one process must wait for another to complete some operation before proceeding. For example, one process (called a writer) may be writing data to a certain main memory area, while another process (a reader) may be reading data from that area and sending it to the printer. The reader and writer must be synchronized so that the writer does not overwrite existing data with new data until the reader has processed it. Similarly, the reader should not start to read until data has actually been written to the area. Various synchronization techniques have been developed. In one method, the operating system provides special commands that allow one process to signal to the second when it begins and completes its operations, so that the second knows when it may start. In another approach, shared data, along with the code to read or write them, are encapsulated in a protected program module. The operating system then enforces rules of mutual exclusion, which allow only one reader or writer at a time to access the module. Process synchronization may also be supported by an interprocess communication facility, a feature of the operating system that allows processes to send messages to one another.
Designing software as a group of cooperating processes has been made simpler by the concept of “threads.” A single process may contain several executable programs (threads) that work together as a coherent whole. One thread might, for example, handle error signals, another might send a message about the error to the user, while a third thread is executing the actual task of the process. Modern operating systems provide management services (e.g., scheduling, synchronization) for such multithreaded processes.
Virtual Memory
Another area of operating-system research has been the design of virtual memory. Virtual memory is a scheme that gives users the illusion of working with a large block of contiguous memory space (perhaps even larger than real memory), when in actuality most of their work is on auxiliary storage (disk). Fixed-size blocks (pages) or variable-size blocks (segments) of the job are read into main memory as needed. Questions such as how much actual main memory space to allocate to users and which page should be returned to disk (“swapped out”) to make room for an incoming page must be addressed in order for the system to execute jobs efficiently. Some virtual memory issues must be continually reexamined; for example, the optimal page size may change as main memory becomes larger and quicker.
Job Scheduling
The allocation of system resources to various tasks, known as job scheduling, is a major assignment of the operating system. The system maintains prioritized queues of jobs waiting for CPU time and must decide which job to take from which queue and how much time to allocate to it, so that all jobs are completed in a fair and timely manner.
Graphical User Interfaces
A highly visible aspect of the change in operating systems in recent years is the increasingly prevalent use of graphical user interfaces (GUIs). In the early days of computing, punch cards, written in the Job Control Language (JCL), were used to specify precisely which system resources a job would need and when the operating system should assign them to the job. Later, computer consoles allowed an operator directly to type commands—e.g., to open files, run programs, manipulate data, and print results—that could be executed immediately or at some future time. (Operating system commands stored for later execution are generally referred to as scripts; scripts are still widely used, especially for controlling servers.) With the advent of personal computers and the desire to make them more user-friendly, the operating system interface has become for most users a set of icons and menus so that the user only needs to “point and click” to send a command to the operating system.
Distributed Operating Systems
With the advent of computer networks, in which many computers are linked together and are able to communicate with one another, distributed computing became feasible. A distributed computation is one that is carried out on more than one machine in a cooperative manner. A group of linked computers working cooperatively on tasks, referred to as a distributed system, often requires a distributed operating system to manage the distributed resources. Distributed operating systems must handle all the usual problems of operating systems, such as deadlock. Distributed deadlock is very difficult to prevent; it is not feasible to number all the resources in a distributed system. Hence, deadlock must be detected by some scheme that incorporates substantial communication among network sites and careful synchronization, lest network delays cause deadlocks to be falsely detected and processes aborted unnecessarily. Interprocess communication must be extended to processes residing on different network hosts, since the loosely coupled architecture of computer networks requires that all communication be done by message passing. Important systems concerns unique to the distributed case are workload sharing, which attempts to take advantage of access to multiple computers to complete jobs faster; task migration, which supports workload sharing by efficiently moving jobs among machines; and automatic task replication at different sites for greater reliability. These concerns, in addition to the overall design of distributed operating systems and their interaction with the operating systems of the component computers, are subjects of current research.

Information systems and databases

File Storage
Computers have been used since the 1950s for the storage and processing of data. An important point to note is that the main memory of a computer provides only temporary storage; any data stored in main memory is lost when the power is turned off. For the permanent storage of data, one must turn to auxiliary storage, primarily magnetic and optical media such as tapes, disks, and CDs. Data is stored on such media but must be read into main memory for processing. A major goal of information-system designers has been to develop software to locate specific data on auxiliary storage and read it efficiently into main memory for processing. The underlying structure of an information system is a set of files stored permanently on some secondary storage device. The software that comprises a file management system supports the logical breakdown of a file into records. Each record describes some thing (or entity) and consists of a number of fields, where each field gives the value of some property (or attribute) of the entity. A simple file of records is adequate for uncomplicated business data, such as an inventory of a grocery store or a collection of customer accounts.
Early file systems were always sequential, meaning that the successive records had to be processed in the order in which they were stored, starting from the beginning and proceeding down to the end. This file structure was appropriate and was in fact the only one possible when files were stored solely on large reels of magnetic tape and skipping around to access random data was not feasible. Sequential files are generally stored in some sorted order (e.g., alphabetic) for printing of reports (e.g., a telephone directory) and for efficient processing of batches of transactions. Banking transactions (deposits and withdrawals), for instance, might be sorted in the same order as the accounts file, so that as each transaction is read the system need only scan ahead (never backward) to find the accounts record to which it applies.
When so-called direct-access storage devices (DASDs; primarily magnetic disks) were developed, it became possible to access a random data block on the disk. (A data block is the unit of transfer between main memory and auxiliary storage and usually consists of several records.) Files can then be indexed so that an arbitrary record can be located and fetched (loaded into the main memory). An index of a file is much like an index of a book; it consists of a listing of identifiers that distinguish the records (e.g., names might be used to identify personnel records), along with the records’ locations. Since indexes might be long, they are usually structured in some hierarchical fashion and are navigated by using pointers, which are identifiers that contain the address (location in memory) of some item. The top level of an index, for example, might contain locations of (point to) indexes to items beginning with the letters A, B, etc. The A index itself may contain not locations of data items but pointers to indexes of items beginning with the letters Ab, Ac, and so on. Reaching the final pointer to the desired record by traversing such a treelike structure is quite rapid. File systems making use of indexes can be either purely indexed, in which case the records need be in no particular order and every individual record must have an index entry that points to the record’s location, or they can be “indexed-sequential.” In this case a sort order of the records as well as of the indexes is maintained, and index entries need only give the location of a block of sequentially ordered records. Searching for a particular record in a file is aided by maintaining secondary indexes on arbitrary attributes as well as by maintaining a primary index on the same attribute on which the file is sorted. For example, a personnel file may be sorted on (and maintain a primary index on) employee identification numbers, but it might also maintain indexes on names and departments. An indexed-sequential file system supports not only file search and manipulation commands of both a sequential and index-based nature but also the automatic creation of indexes.
Types of Database Models
File systems of varying degrees of sophistication satisfied the need for information storage and processing for several years. However, large enterprises tended to build many independent files containing related and even overlapping data, and data-processing activities frequently required the linking of data from several files. It was natural, then, to design data structures and database management systems that supported the automatic linkage of files. Three database models were developed to support the linkage of records of different types. These are: (1) the hierarchical model, in which record types are linked in a treelike structure (e.g., employee records might be grouped under a record describing the departments in which employees work); (2) the network model, in which arbitrary linkages of record types may be created (e.g., employee records might be linked on one hand to employees’ departments and on the other hand to their supervisors—that is, other employees); and (3) the relational model, in which all data are represented in simple tabular form.
In the relational model, the description of a particular entity is provided by the set of its attribute values, stored as one row of the table, or relation. This linkage of n attribute values to provide a meaningful description of a real-world entity or a relationship among such entities forms a mathematical n-tuple; in database terminology, it is simply called a tuple. The relational approach also supports queries (requests for information) that involve several tables by providing automatic linkage across tables by means of a “join” operation that combines records with identical values of common attributes. Payroll data, for example, could be stored in one table and personnel benefits data in another; complete information on an employee could be obtained by joining the tables on the employee’s identification number. To support any of these database structures, a large piece of software known as a database management system (DBMS) is required to handle the storage and retrieval of data (via the file management system, since the data are physically stored as files on magnetic disk) and to provide the user with commands to query and update the database. The relational approach is currently the most popular, as older hierarchical data management systems, such as IMS, the information management system produced by IBM, are being replaced by relational database management systems such as IBM’s large mainframe system DB2 or the Oracle Corporation’s DBMS, which runs on large servers. Relational DBMS software is also available for workstations and personal computers.
The need for more powerful and flexible data models to support nonbusiness applications (e.g., scientific or engineering applications) has led to extended relational data models in which table entries need not be simple values but can be programs, text, unstructured data in the form of binary large objects (BLOBs), or any other format the user requires. Another development has been the incorporation of the object concept that has become significant in programming languages. In object-oriented databases, all data are objects. Objects may be linked together by an “is-part-of” relationship to represent larger, composite objects. Data describing a truck, for instance, may be stored as a composite of a particular engine, chassis, drive train, and so forth. Classes of objects may form a hierarchy in which individual objects may inherit properties from objects farther up in the hierarchy. For example, objects of the class “motorized vehicle” all have an engine; members of subclasses such as “truck” or “airplane” will then also have an engine. Furthermore, engines are also data objects, and the engine attribute of a particular vehicle will be a link to a specific engine object. Multimedia databases, in which voice, music, and video are stored along with the traditional textual information, are becoming increasingly important and also are providing an impetus toward viewing data as objects, as are databases of pictorial images such as photographs or maps. The future of database technology is generally perceived to be a merging of the relational and object-oriented views.
Data Integrity
Integrity is a major database issue. In general, integrity refers to maintaining the correctness and consistency of the data. Some integrity checking is made possible by specifying the data type of an item. For example, if an identification number is specified to be nine digits, the DBMS may reject an update attempting to assign a value with more or fewer digits or one including an alphabetic character. Another type of integrity, known as referential integrity, requires that an entity referenced by the data for some other entity must itself exist in the database. For example, if an airline reservation is requested for a particular flight number, then the flight referenced by that number must actually exist. Although one could imagine integrity constraints that limit the values of data items to specified ranges (to prevent the famous “computer errors” of the type in which a $10 check is accidentally issued as $10,000), most database management systems do not support such constraints but leave them to the domain of the application program.
Access to a database by multiple simultaneous users requires that the DBMS include a concurrency control mechanism to maintain the consistency of the data in spite of the possibility that a user may interfere with the updates attempted by another user. For example, two travel agents may try to book the last seat on a plane at more or less the same time. Without concurrency control, both may think they have succeeded, while only one booking is actually entered into the database. A key concept in studying concurrency control and the maintenance of database correctness is the transaction, defined as a sequence of operations on the data that transform the database from one consistent state into another. To illustrate the importance of this concept, consider the simple example of an electronic transfer of funds (say $5) from bank account A to account B. The operation that deducts $5 from account A leaves the database inconsistent in that the total over all accounts is $5 short. Similarly, the operation that adds $5 to account B in itself makes the total $5 too much. Combining these two operations, however, yields a valid transaction. The key to maintaining database correctness is therefore to ensure that only complete transactions are applied to the data and that multiple concurrent transactions are executed (under a concurrency control mechanism) in such a way that a serial order can be defined that would produce the same results. A transaction-oriented control mechanism for database access becomes difficult in the case of so-called long transactions—for example, when several engineers are working, perhaps over the course of several days, on a product design that may not reach a consistent state until the project is complete. The best approach to handling long transactions is a current area of database research.
As discussed above, databases may be distributed, in the sense that data reside at different host computers on a network. Distributed data may or may not be replicated, but in any case the concurrency-control problem is magnified. Distributed databases must have a distributed database management system to provide overall control of queries and updates in a manner that ideally does not require that the user know the location of the data. The attainment of the ideal situation, in which various databases fall under the unified control of a distributed DBMS, has been slowed both by technical problems and by such practical problems as heterogeneous hardware and software and database owners who desire local autonomy. Increasing mention is being made of more loosely linked collections of data, known by such names as multidatabases or federated databases. A closely related concept is interoperability, meaning the ability of the user of one member of a group of disparate systems (all having the same functionality) to work with any of the systems of the group with equal ease and via the same interface. In the case of database management systems, interoperability means the ability of users to formulate queries to any one of a group of independent, autonomous database management systems using the same language, to be provided with a unified view of the contents of all the individual databases, to formulate queries that may require fetching data via more than one of the systems, and to be able to update data stored under any member of the group. Many of the problems of distributed databases are the problems of distributed systems in general. Thus distributed databases may be designed as client-server systems, with middleware easing the heterogeneity problems.
Database Security
Security is another important database issue. Data residing on a computer is under threat of being stolen, destroyed, or modified maliciously. This is true whenever the computer is accessible to multiple users but is particularly significant when the computer is accessible over a network. The first line of defense is to allow access to a computer only to authorized, trusted users and to authenticate those users by a password or similar mechanism. But clever programmers have learned how to evade such mechanisms, designing, for example, so-called computer viruses—programs that replicate themselves and spread among the computers in a network, “infecting” systems and potentially destroying files. Data can be stolen by devices such as “Trojan horses”—programs that carry out some useful task but contain hidden malicious code—or by simply eavesdropping on network communications. The need to protect sensitive data (e.g., for national security) has led to extensive research in cryptography and the development of encryption standards for providing a high level of confidence that the data is safe from decoding by even the most powerful computer attacks. The term computer theft, however, usually refers not to theft of information from a computer but rather to theft by use of a computer, typically by modifying data. If a bank’s records are not adequately secure, for example, someone could set up a false account and transfer money into it from valid accounts for later withdrawal.
Artificial intelligence
Artificial intelligence (AI) is an area of research that goes back to the very beginnings of computer science. The idea of building a machine that can perform tasks perceived as requiring human intelligence is an attractive one. The tasks that have been studied from this point of view include game playing, language translation, natural-language understanding, fault diagnosis, robotics, and supplying expert advice. For a detailed discussion of the successes—and failures—of AI over the years, see the article artificial intelligence.
Computer graphics
Computer graphics is the field that deals with display and control of images on the computer screen. Applications may be broken down into four major categories: (1) design (computer-aided design [CAD] systems), in which the computer is used as a tool in designing objects ranging from automobiles to bridges to computer chips by providing an interactive drawing tool and an interface to simulation and analysis tools for the engineer; (2) fine arts, in which artists use the computer screen as a medium to create images of impressive beauty, cinematographic special effects, animated cartoons, and television commercials; (3) scientific visualization, in which simulations of scientific events—such as the birth of a star or the development of a tornado—are exhibited pictorially and in motion so as to provide far more insight into the phenomena than would tables of numbers; and (4) human-computer interfaces.
Graphics-based computer interfaces, which enable users to communicate with the computer by such simple means as pointing to an icon with a handheld device known as a mouse, have allowed millions of ordinary people to control application programs like spreadsheets and word processors. Graphics technology also supports windows (display boxes) environments on the workstation or personal computer screen, which allow users to work with different applications simultaneously, one in each window. Graphics also provide realistic interfacing to video games, flight simulators, and other simulations of reality or fantasy. The term virtual reality has been coined to refer to interaction with a computer-simulated virtual world.
A challenge for computer science has been to develop algorithms for manipulating the myriad lines, triangles, and polygons that make up a computer image. In order for realistic on-screen images to be generated, the problems introduced in approximating objects as a set of planar units must be addressed. Edges of objects are smoothed so that the underlying construction from polygons is not visible, and representations of surfaces are textured. In many applications, still pictures are inadequate, and rapid display of real-time images is required. Both extremely efficient algorithms and state-of-the-art hardware are needed to accomplish such real-time animation. Technical details of graphics displays are discussed in computer graphics.


Computational methods and numerical analysis
The mathematical methods needed for computations in engineering and the sciences must be transformed from the continuous to the discrete in order to be carried out on a computer. For example, the computer integration of a function over an interval is accomplished not by applying integral calculus to the function expressed as a formula but rather by approximating the area under the function graph by a sum of geometric areas obtained from evaluating the function at discrete points. Similarly, the solution of a differential equation is obtained as a sequence of discrete points determined, in simplistic terms, by approximating the true solution curve by a sequence of tangential line segments. When discretized in this way, many problems can be recast in the form of an equation involving a matrix (a rectangular array of numbers) that is solvable with techniques from linear algebra. Numerical analysis is the study of such computational methods. Several factors must be considered when applying numerical methods: (1) the conditions under which the method yields a solution, (2) the accuracy of the solution, and, since many methods are iterative, (3) whether the iteration is stable (in the sense of not exhibiting eventual error growth), and (4) how long (in terms of the number of steps) it will generally take to obtain a solution of the desired accuracy.
The need to study ever-larger systems of equations, combined with the development of large and powerful multiprocessors (supercomputers) that allow many operations to proceed in parallel by assigning them to separate processing elements, has sparked much interest in the design and analysis of parallel computational methods that may be carried out on such parallel machines.
Data structures and algorithms
A major area of study in computer science has been the storage of data for efficient search and retrieval. The main memory of a computer is linear, consisting of a sequence of memory cells that are numbered 0, 1, 2,… in order. Similarly, the simplest data structure is the one-dimensional, or linear, array, in which array elements are numbered with consecutive integers and array contents may be accessed by the element numbers. Data items (a list of names, for example) are often stored in arrays, and efficient methods are sought to handle the array data. Search techniques must address, for example, how a particular name is to be found. One possibility is to examine the contents of each element in turn. If the list is long, it is important to sort the data first—in the case of names, to alphabetize them. Just as the alphabetizing of names in a telephone book greatly facilitates their retrieval by a user, the sorting of list elements significantly reduces the search time required by a computer algorithm as compared to a search on an unsorted list. Many algorithms have been developed for sorting data efficiently. These algorithms have application not only to data structures residing in main memory but even more importantly to the files that constitute information systems and databases.
Although data items are stored consecutively in memory, they may be linked together by pointers (essentially, memory addresses stored with an item to indicate where the “next” item or items in the structure are found) so that the items appear to be stored differently than they actually are. An example of such a structure is the linked list, in which noncontiguously stored items may be accessed in a prespecified order by following the pointers from one item in the list to the next. The list may be circular, with the last item pointing to the first, or may have pointers in both directions to form a doubly linked list. Algorithms have been developed for efficiently manipulating such lists—searching for, inserting, and removing items.
Pointers provide the ability to link data in other ways. Graphs, for example, consist of a set of nodes (items) and linkages between them (known as edges). Such a graph might represent a set of cities and the highways joining them or the layout of circuit elements and connecting wires on a VLSI chip. Typical graph algorithms include solutions to traversal problems, such as how to follow the links from node to node (perhaps searching for a node with a particular property) in such a way that each node is visited only once. A related problem is the determination of the shortest path between two given nodes. (For background on the mathematical theory of networks, see the article graph theory.) A problem of practical interest in designing any network is to determine how many “broken” links can be tolerated before communications begin to fail. Similarly, in VLSI chip design it is important to know whether the graph representing a circuit is planar, that is, whether it can be drawn in two dimensions without any links crossing each other.
Impact of computer systems
The preceding sections of this article give some idea of the pervasiveness of computer technology in society. Many products used in everyday life now incorporate computer systems: programmable, computer-controlled VCRs in the living room, programmable microwave ovens in the kitchen, programmable thermostats to control heating and cooling systems—the list seems endless. This section will survey a few of the major areas where computers currently have—or will likely soon have—a major impact on society. As noted below, computer technology not only has solved problems but also has created some, including a certain amount of culture shock as individuals attempt to deal with the new technology. A major role of computer science has been to alleviate such problems, mainly by making computer systems cheaper, faster, more reliable, and easier to use.
Computers in the workplace
Computers are omnipresent in the workplace. Word processors—computer software packages that simplify the creation and modification of textual documents—have largely replaced the typewriter. Electronic mail has made it easy to transmit textual messages (possibly containing embedded picture and sound files) worldwide, using computers, cellular telephones, and specially equipped televisions via telephone, satellite, and cable television networks. Office automation has become the term for linking workstations, printers, database systems, and other tools by means of a local area network (LAN). An eventual goal of office automation has been termed the “paperless office.” Although such changes ultimately make office work much more efficient, they have not been without cost in purchasing and frequently upgrading the necessary hardware and software and in training workers to use the new technology.
Computer-integrated manufacturing (CIM) is a relatively new technology arising from the application of many computer science subdisciplines to support the manufacturing enterprise. The technology of CIM emphasizes that all aspects of manufacturing should be not only computerized as much as possible but also linked into an integrated whole via a computer communication network. For example, the design engineer’s workstation should be linked into the overall system so that design specifications and manufacturing instructions may be sent automatically to the shop floor. The inventory databases should be linked in as well, so product inventories may be incremented automatically and supply inventories decremented as manufacturing proceeds. An automated inspection system (or a manual inspection station supplied with online terminal entry) should be linked to a quality-control system that maintains a database of quality information and alerts the manager if quality is deteriorating and possibly even provides a diagnosis as to the source of any problems that arise. Automatically tracking the flow of products from station to station on the factory floor allows an analysis program to identify bottlenecks and recommend replacement of faulty equipment. In short, CIM has the potential to enable manufacturers to build cheaper, higher quality products and thus improve their competitiveness. Implementing CIM is initially costly, of course, and progress in carrying out this technology has been slowed not only by its cost but also by the lack of standardized interfaces between the various CIM components and by the slow acceptance of standardized communication protocols to support integration. Although the ideal of CIM is perhaps just beyond reach at the present time, manufacturers are now able to improve their operations by, for example, linking robot controllers to mainframes for easy and correct downloading of revised robot instructions. Also available are elaborate software packages that simplify the building of databases for such applications as inventories, personnel statistics, and quality control and that incorporate tools for data analysis and decision support.
  • Recommend Us