Operating systems have traditionally provided five main functions, as described in the text's Ch 1:
Prior to the '90s, a company that wanted to network its computers had to purchase a separate 'network operating system', and a 'server operating system' on a midrange or mainframe was required for multi-user applications that needed security. Since the '90s, as systems have become more and more 'client/server' oriented and used by more than one person, practically every OS likely to be used in a home or business has provided two more functions:
Here's a discussion of these OS functions; they will be demonstrated during class using Windows and Linux:
The User Interface, often called a 'shell', is what we see and use. Windows provides a GUI shell that handles most OS functions for ordinary users. It also provides a Command Line window (no more DOS since XP) where a 'command line interface' is available for less-than-ordinary tasks by owners or network managers.
A Linux or Unix user may have a choice of GUI by using one of the popular XWindows interfaces like Gnome, KDE, or Unity maybe enhanced by Compiz Fusion or other eye-candy for desktop functions. But, many system and network management functions require use of the 'command line' or 'character based' interfaces. Some servers, routers, and other industrial-strength OSs don't provide a GUI at all.
We'll cover a few OS/shell combinations in later weeks. The shell allows a user to control the other four components.
OSs also provide other, non-human, interfaces: the API-Application Program Interface provides a way for programs to interface with the OS, allowing a script to direct the OS to do something or query it to get data like file sizes or the current system date/time. In OSs that support web servers there is the CGI-Common Gateway Interface, which makes it easy to get data from browsers' or web services' GET & POST data and Cookies.
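As a small sketch of the API idea, here's a Python script using the standard library, which wraps OS system calls like stat() and the system clock, to ask the OS for a file's size and the current date/time:

```python
# Using OS APIs from a script: Python's standard library wraps
# system calls such as stat() and the OS clock.
import os
import time

# Create a small file, then ask the OS for its size via the stat() API.
with open("demo.txt", "w") as f:
    f.write("hello, OS\n")

info = os.stat("demo.txt")          # stat() system call
print("size in bytes:", info.st_size)

# Ask the OS for the current system date/time.
print("system time:", time.strftime("%Y-%m-%d %H:%M:%S"))

os.remove("demo.txt")               # clean up via the unlink() call
```

The same calls work on Windows and *ix because the language runtime translates them to whichever OS API is underneath.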
On a PC, Device Management components control access to: peripherals like keyboard, mouse, monitor, printers, speakers, &c; Secondary Storage devices like disk, CD, or tape; network devices; and anything else that is put into a slot or plugged into USB -- usually the same CPU (or CPUs) that run the OS and programs manage devices. On a larger machine, mid-range or mainframe, specialized CPUs might be dedicated to managing devices so the CPUs that run programs aren't interrupted for device management tasks.
Spoolers: Printers and other 'asynchronous' processes like email generally have a 'spooler' so that jobs don't get mangled together when a network full of people is printing and handling email.
When a print job starts, the output is first placed in 'spool files' on disk, then doled out to the printers using rules set by a network manager. When the job has printed, the spool file is deleted. 'Runaway spoolers' are a classic cause of system crashes, so careful placement of spool files is critical to prevent spooled print jobs from consuming all the space on a disk.
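The scheme above can be sketched as a toy spooler. This is only an illustration of the queue-to-disk idea, not any real spooler's code; the directory and file names are made up:

```python
# A toy print spooler: incoming jobs are first written to spool files
# on disk, then doled out one at a time (FIFO), and each spool file
# is deleted once its job has "printed".
import os
from collections import deque

SPOOL_DIR = "spool"                       # hypothetical spool directory
os.makedirs(SPOOL_DIR, exist_ok=True)
queue = deque()

def submit(job_id, text):
    """Write the job to a spool file and queue it."""
    path = os.path.join(SPOOL_DIR, f"job{job_id}.spl")
    with open(path, "w") as f:
        f.write(text)
    queue.append(path)

def print_next():
    """'Print' the oldest job, then delete its spool file."""
    path = queue.popleft()
    with open(path) as f:
        print(f.read(), end="")
    os.remove(path)                       # job done: spool file deleted

submit(1, "Alice's report\n")
submit(2, "Bob's invoice\n")
while queue:
    print_next()
```

Because jobs land on disk first, two users can 'print' at the same instant without their output interleaving, which is the whole point of spooling.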
File Management functions let users, or application software, perform the basic file functions, 'CRUD': create, read, update, delete, plus copy. The OS generally handles entire files, locating them by name in a hierarchical directory path (hierarchy commonly denoted with \ in Windows, / in *ix, or : in early Mac).
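The CRUD functions, plus copy, look like this when a script does them through the OS's file management calls (the file names here are just for the demo):

```python
# The basic file functions -- create, read, update, delete, plus copy --
# performed through the OS's file management calls.
import os
import shutil

with open("notes.txt", "w") as f:        # Create
    f.write("first line\n")

with open("notes.txt", "a") as f:        # Update (here, an append)
    f.write("second line\n")

with open("notes.txt") as f:             # Read
    content = f.read()
print(content, end="")

shutil.copy("notes.txt", "notes.bak")    # Copy

os.remove("notes.txt")                   # Delete
os.remove("notes.bak")
```

Notice the script never cares which cylinder or block the data lands on; locating the bytes on disk is the OS's job.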
Windows and Linux both provide features for searching file systems using 'pattern matching' or other techniques, so that we can find most anything if we can recollect bits or snippets of the content.
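A minimal sketch of that kind of search, along the lines of what the *ix find/grep tools or Windows Search do: walk a directory tree, match file names against a wildcard, and look inside for a snippet of content. The directory and file here are invented for the demo:

```python
# Pattern-matching search over a file system: wildcard on names,
# substring match on contents.
import fnmatch
import os

def search(top, name_pattern, snippet):
    """Return paths under 'top' whose names match the wildcard
    pattern and whose contents contain the snippet."""
    hits = []
    for dirpath, _dirs, files in os.walk(top):
        for name in fnmatch.filter(files, name_pattern):
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="ignore") as f:
                    if snippet in f.read():
                        hits.append(path)
            except OSError:
                pass                      # unreadable file: skip it
    return hits

os.makedirs("docs", exist_ok=True)
with open("docs/memo.txt", "w") as f:
    f.write("quarterly spool budget\n")

print(search("docs", "*.txt", "spool"))
```

The real OS tools add indexing so the whole disk doesn't have to be read for every query, but the matching logic is the same idea.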
To get at the data _inside_ the files usually takes some kind of application software, sometimes provided by the OS and other times purchased from some developer of software for a platform. For example, OS components called 'editors' (like vi in unix or Notepad in Windows) allow users to modify the contents of ASCII, 'plain text', files. But other, non-OS-provided editors offer more functionality for the non-casual user, especially a programmer or administrator, so we find data processing pros choosing editors like EMACS, Midnight Commander, Crimson, Visual Studio, DreamWeaver MX, or FrontPage (aggh!) to provide extra power for the particular tasks at hand.
The contents of files containing other data for images, spreadsheets, or Word documents are maintained by applications like GIMP, Excel, or Word.
Memory Management is concerned with the system's Primary Storage, RAM and Cache, located further from and nearer to the CPUs. Many operating systems provide Virtual Memory to swap the contents of RAM to disk when there is not enough contiguous space in RAM to handle demands from active processes. DMA-Direct Memory Access channels built into a computer's chipset move most data without burdening the CPU.
Memory management includes 'Garbage Collection' to return memory no longer needed by retiring processes to the pool for new processes to use. Garbage collection routines attempt to regain large, contiguous blocks of space and keep them available for the next process that is launched. 'Memory Leakage' is rampant in some application environments, especially Windows IMHO, and could result in the 'blue screen of death' that was so familiar to NT administrators who failed to reboot their servers often enough. Later versions of Windows, personal and server, handle memory leakage more gracefully but haven't entirely ridded the environment of the problem. AIX and other UNIXes, especially the non-stop versions, may run their entire service life without leaking memory.
Process Management schemes in personal devices and larger computers are almost all 'multi-tasking' these days. Flip phones and other primitive personal devices only run one process at a time, but everything else runs at least several concurrent processes, with mid-range and mainframe computers handling millions at a time.
Using ps (process status) in *ix or the Task Manager (ctrl-alt-delete) in Windows shows a list of processes that are running on your computer. Android and iOS can also show us the processes that are running, maybe using up the battery? An operating system's process management functions more-or-less equitably divvy up the limited resource of CPU Time.
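On Linux, what ps shows is drawn from the /proc file system, where the kernel exposes each running process as a numbered directory. A short sketch (Linux-only; on Windows the equivalent data comes from different APIs):

```python
# Listing processes the way ps does on Linux: each numeric directory
# under /proc is a PID, and /proc/<pid>/comm holds the command name.
import os

def list_processes():
    """Return (pid, command) pairs read from /proc."""
    procs = []
    for entry in os.listdir("/proc"):
        if entry.isdigit():                      # numeric dirs are PIDs
            try:
                with open(f"/proc/{entry}/comm") as f:
                    procs.append((int(entry), f.read().strip()))
            except OSError:
                pass                             # process exited meanwhile
    return procs

for pid, comm in list_processes()[:5]:
    print(pid, comm)
```

Even an 'idle' desktop will show dozens of entries, which is the multi-tasking described above made visible.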
In most GUI desktop systems (Windows, Mac, or Linux) there are usually a couple or a few dozen 'processes' vying for compute cycles. Even a single 32-bit CPU, with its space in RAM & Cache and bandwidth on a disk or network controller, has been more than adequate to keep a person happy. These systems have provided very 'crisp' response since processors reached 400-800 MegaHz, back when an active GUI presentation & mouse-event capturing took about 80% of the CPU's power. Today's 64-bit, multi-core processors only use a few percent of their bandwidth to handle our GUIs and make our applications run faster than ever.
The busy 'host computer' or server that is running an enterprise may have hundreds or thousands of users entering and using data processed by the applications it runs. Each user may be running one or more processes as they do their work. The mid-range or mainframe computer has more resources and more 'channels' & 'dedicated processors' to manage them. For example, a 32-bit desktop PC could manage 4 GigaBytes of RAM, and this was OK for a gamer or engineer running a 'compute intensive' task, or a small server handling a few dozen users, where 'disk swapping' was minimal.
But, a computer with 64-bit technology is able to reference a TeraByte or more of RAM and avoid much disk access altogether, working directly from huge RAM-drives. (IBM has been providing dual-core, 64-bit processors in the Power line since the early 2000s and 8-core Cell technology since the mid-2000s. Intel, AMD, and others got there about 2006 and brought the technology to the desktops.)
Larger mid-range and mainframe machines can handle several or lots of TeraBytes of RAM and juggle millions of processes among dozens of CPUs to satisfy hundreds of thousands or millions of users' processes.
Programming for multiple CPUs can be very difficult. Luckily, technology has mostly automated the job so that most application developers don't have to worry about the complexity of multi-programming to take advantage of the multi-processor environment. They write the code and the OS figures out how best to deploy it on however many CPUs are available.
Along with referencing relatively huge amounts of memory and cache with their 64-bit CPU words, modern workstations, gaming machines, mini-computers, and mainframes may have two or more CPUs/Cores working in parallel using 'SMP' (Symmetric Multiprocessing). This allows the 'multi-threading' techniques of modern OSs & programming languages to be used so that an application's processes can be programmed to run concurrently, when appropriate, instead of in sequence as is required when only one CPU is available. SMP components on the CPUs, mainboards, and operating systems make all this happen automatically. Languages like C++, Java, and VS.NET let programmers write code to take better advantage of multiple CPUs if needed.
An OS that supports SMP automatically divvies up tasks for the multiple CPUs without any of the programmers' concern. These machines & OSs can service thousands of time-shared users' keystrokes & requests for database access so efficiently that they all get sub-second response times. Of course, any computer can be 'overloaded', and users of inadequately-sized systems decorated their cubicles with a picture of a skeleton sitting in front of a computer terminal over the caption 'How is the response time?'.
As an historical note, the Motorola 68000 line included features for communication with a 'supervisory processor' on a board or a backplane that made those chips more suitable for deployment in a multi-CPU environment for years before Intel processors. When Intel's 80486 processors came out with this feature, manufacturers of 'highly-available' or 'fault-tolerant' hardware platforms that had used RISC CPUs for a decade, like Stratus, were able to use whichever CPU gave the best bang/buck performance in the season the machine was delivered. This allowed them to deploy multiple unix or Windows servers in one fault-tolerant chassis.
Where we oldsters used to say that RISC processors were better suited for multi-CPU deployment of enterprise applications, there are lots of systems with CISC CPUs that do it well, too. ARM-64 technology may push the server market more firmly toward RISC again, with both Google and Facebook deploying RISC in their data centers. SoC-Systems on Chips built with ARM-64 technology include Ethernet fabric processors on the same die with the CPU and are ideal for hyper-converged systems using spine-and-leaf architecture for their storage-area networks.
We used to see, through the '90s, that the 'point of diminishing returns' was reached at something less than six or eight CPUs in an SMP scheme. But now that number is higher, with server-class machines handling a couple dozen cores per socket and mid-range computers making effective use of hundreds of CPUs for SMP.
IBM provides the ultimate in process control, 'Capacity on Demand', where systems are shipped with additional CPUs to be used, and paid for, only during peak seasons and turned off at other times -- since many of their customers are in retail and 'mail order' they need more CPUs in the lead-up to the Holidays in late Fall and the returns season than in the other 10 months of the year.
The text discusses the same Five OS Functions the instructor's been discussing since the '80s. Since networks and The Internet came on the scene personal and server operating systems have included two more critical functions:
AAA -- Authenticating users, Authorizing their access to system resources, and Accounting for their activity -- are parts of modern OSs like Unix, Linux, or Windows NT or 2003 Server. Even a 'desktop OS' like XP or an OS for a portable device is likely to include methods for someone with administrator privileges on the machine to set up profiles for individuals who will be using the machine. DOS and early Apple OSs, on the other hand, had no way of identifying individual users, and made everything on a machine more-or-less available to whoever flipped the power on.
A UserId and PassWord combination is involved in most methods for authenticating users as they 'log on' to a server or host machine. Magnetic stripes, RF chips, or biometric devices are also involved in some systems.
Modern 'multifactor' authentication schemes involve more than one factor to authenticate persons: something they know (a password or PIN), something they have (a card, token, or phone), and something they are (a fingerprint or other biometric).
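On the 'something they know' side, the OS doesn't store the password itself. A minimal sketch of the salted-hash idea with Python's standard library (real systems keep these records in a shadow file or directory service; the user table here is hypothetical):

```python
# Salted password hashing: store a random salt plus a slow hash of
# the password, never the password itself.  Checking a login attempt
# means re-hashing with the stored salt and comparing.
import hashlib
import os

def make_record(password):
    """Hash the password with a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check(password, record):
    """Re-hash an attempt with the stored salt and compare."""
    salt, digest = record
    attempt = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return attempt == digest

users = {"alice": make_record("s3cret!")}    # hypothetical user table
print(check("s3cret!", users["alice"]))
print(check("guess", users["alice"]))
```

The salt means two users with the same password get different records, and the slow hash makes brute-force guessing expensive.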
Other schemes for authenticating users as they connect from one node on a network to another involve keeping a 'private key' in the user's home directory on their 'home machine' and storing a 'public key' in their home directory on machines they visit. This also works to support non-repudiation in eBusiness, where part of the setup for an EDI trading partnership includes exchange of public keys. The PKI-Public Key Infrastructure provides a very secure authentication scheme.
You're welcome to set up rsa keys at info300.net so you can log in from a trusted device without keying the password.
In organizations where there may be a large number of servers it would be extremely inconvenient for a user to be challenged for a password by each of the servers/domain controllers that provides services. In these cases, a 'super domain' scheme like Kerberos, LDAP-Lightweight Directory Access Protocol, or Windows Active Directory is used to authenticate users once and then 'trust' and continue to authorize their use of resources in a secure way. VCU's CAS-Central Authentication Service works this way -- log in once and get access to BlackBoard, VCU's GMail and Google Apps, eServices, and other university on-line resources.
Ethernet and Internet protocols have been included in OSs since the mid-'80s. Windows, after 3.11 for Workgroups, and Unix have provided networking functions built into the operating system. For years, Apple bundled AppleTalk into their Mac OS and started providing Internet and Ethernet in the late '80s.
Earlier desktop and server OSs, like DOS or CP/M, required separate purchase of a 'Network OS' like Novell, LANtastic, Banyan Vines, or another network OS so that a PC could share networked resources. Since Windows 3.11 for Workgroups, Microsoft has included support for SMB and other protocols for a GUI-managed 'Network Neighborhood' with peer-to-peer and client-server relationships.
All of today's personal operating systems have support for Ethernet and Internet protocols built in. Linux, Windows, Mac, Android, iOS, even Windows CE can do Ethernet and Internet via Cell or LAN. We expect our computers to come out of the box able to participate in LANs and access The Internet as presented by our ISP. Personal devices from smartphones through notebooks and desktops handle the several protocols of the TCP/IP suite, like SMTP, ICMP, POP3, IMAP, HTTP, SFTP, SSH, SSL, TCP, and IP.
Common Server OSs are any of these: proprietary unices like AIX, HP/UX, or SCO UnixWare; Open Source unices like Linux or OpenBSD; IBM's proprietary non-unix i5/OS or z/OS; or Microsoft's proprietary Windows Server 2000 and later. These all have protocols built in for Ethernet and Internet, plus they can interface directly to T-Carrier and E-Carrier for telephone and OC-Optical Carrier for high-speed fiber optics. Since the late '90s practically all servers handle security protocols like SSH, SFTP, and public/private key TLS/SSL for browsers, web services, and applications. Linux and unix servers may be used as gateway, firewall, and/or proxy for internet traffic of http, smtp, and pop3 on wired or optical circuits.
'Virtual Memory' is a scheme used in multi-tasking OSs where process, memory, and file system management cooperate to run lots of processes thru a number of CPUs that reference the same RAM. Most OSs since the '90s are multi-tasking, whether they operate systems for one user or millions. PC OSs like Mac, Windoze, or Ubuntu run lots of tasks for a single user -- the W7 I'm using now is running 107 processes for me at the time of this keystroking. Droids run dozens of processes -- my Droid GingerBread is running 47.
Servers, like Windows Server, Linux, or zOS run a few tasks at a time for each of the dozens, hundreds, or millions of users attached to the server. Info300.net, aka info202.info, reliably supports a lab or two of 30 users pounding at vi, debugging websites, and running scripts and database queries -- each of these users runs two or three tasks at a time, and would hit a hard limit at about 32,000 tasks. A busy pair of mainframes can handle millions of users' tasks at a time, some of them for employees using the enterprise's application software and others for customers using the enterprise's websites.
As each task is presented to the OS executive, it is assigned a location in 'virtual memory' located on the system's disk and the application's code and data are placed there. The task is scheduled to run on a CPU until completion, usually according to a 'timeslicing' scheme where lots of tasks share a CPU. When a task has a slice of time, its code and data are moved into RAM where the data and instructions are available to the CPU and its registers. When the task's timeslice is used up, the OS moves the data and code back to disk and moves the next task's data and code into RAM. This 'swapping' between virtual memory and RAM goes on & on until the task is complete.
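The timeslicing loop above can be illustrated with a toy round-robin scheduler. This is a simulation of the idea, not a real OS scheduler: each task gets a fixed quantum of 'CPU time', goes back to the end of the queue if it isn't finished, and we count how many times work is 'swapped' into RAM:

```python
# A toy round-robin scheduler: tasks share the CPU in fixed timeslices,
# and every turn on the CPU counts as one 'swap' of code/data into RAM.
from collections import deque

def round_robin(tasks, quantum):
    """tasks: dict of name -> work units remaining.
    Returns (completion order, number of swaps performed)."""
    queue = deque(tasks.items())
    finished, swaps = [], 0
    while queue:
        name, remaining = queue.popleft()
        swaps += 1                            # task moved into RAM
        remaining -= quantum                  # burn one timeslice
        if remaining > 0:
            queue.append((name, remaining))   # slice used up: swap out
        else:
            finished.append(name)             # task complete
    return finished, swaps

done, swaps = round_robin({"editor": 3, "compile": 7, "backup": 5}, quantum=2)
print(done, swaps)
```

Note that the three tasks take 9 swaps between them rather than 3: every extra trip through the queue is another round of swapping, which is exactly the overhead the next paragraph counts as the cost of virtual memory.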
The benefit of virtual memory is that a multi-tasking server can handle more users' work simultaneously, or a personal computer can handle more tasks for a user at the same time. The cost is that disk access is much slower than RAM access, so a system that needs to swap runs lots slower than one with enough RAM to accommodate all the users' processes' code and data without swapping.
DMA and a large, contiguous 'swap area' laid out cylinder-by-cylinder on disk make swapping as quick as it can be, but swapping is always slow compared to systems that have so much RAM that they don't need to do much swapping. This is true whether the system is a notebook used by one person, or a server used by many. Servers that need to process in 'real-time' are configured with enough RAM so that swapping isn't necessary.
Occasionally, virtual memory is used to do the 'garbage collection' required to get enough contiguous space in RAM to launch a large process. We may experience this as several seconds, or more, of a paused interface where we wonder if the system has 'hung up' or 'frozen'.
Mid-range and mainframe 'Heavy Iron' computers, with their HUGE RAMs (several to many TeraBytes!), are able to handle many thousands through millions of tasks without swapping to virtual memory at all -- they do the timeslicing, but don't have to do much swapping.
The sloth of swapping can be demo'd on a personal scale by running Windows Vista or 7 on a machine with only 1 gig of RAM and then loading a few memory-loving apps, or by switching between applications on a cheap tablet with only 256MBytes of RAM. It can be done, but the OS takes up most of the CPUs' bandwidth leaving little available for the users' tasks. If every application has to swap as it runs its timeslice, the effect is not a pleasant experience for the user.
Thrashing is the official term for this. Wikipedia's article on thrashing describes episodes where the virtual memory system is spending more time swapping users' tasks than processing their work, taking a _lot_ of system resources without doing much real work, ticking off users, and customers, who get a multi-second response where sub-second is the best.
If a non-scalable application environment was chosen, a thrashing system can be the death of an enterprise that can't afford to upgrade or jump to a scalable platform. If scalability was a key objective, an enterprise that anticipated exponential growth can handle it by adding more hardware or computers, quickly 'throwing hardware at it' to keep pace with new acquisitions and markets.
Midrange and mainframe systems with two or three large chassis that can hold lots of components can hold multiple TeraBytes of RAM, minimizing the need for swapping, and run thousands of times quicker than smaller machines that need to swap.
Server and blade farms made up of smaller machines can also handle exponential growth and give millions of users good response time using 'load balancing', high-speed storage networks, and a lot of power and a/c. The trade-off to give millions of users sub-second response time is a couple of farms of a couple thousand 'server class' machines vs. a couple of big mainframes.