Distributed Systems Notes, Chapter 1 (for all IV B.Tech students)

Chapter 1
CHARACTERIZATION OF DISTRIBUTED SYSTEMS

1.1 INTRODUCTION

A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages. This definition covers the entire range of systems in which networked computers can usefully be deployed. Computers that are connected by a network may be separated by any distance: they may be on separate continents, in the same building or in the same room.

The prime motivation for constructing and using distributed systems stems from a desire to share resources. The term 'resource' is a rather abstract one, but it best characterizes the range of things that can usefully be shared in a networked computer system. It extends from hardware components such as disks and printers to software-defined entities such as files, databases and data objects of all kinds.

This definition of a distributed system has the following significant consequences:

Concurrency: In a network of computers, concurrent program execution is the norm. The capacity of the system to handle shared resources can be increased by adding more resources (for example, computers) to the network. The coordination of concurrently executing programs that share resources is also an important and recurring topic.

No global clock: When programs need to cooperate, they coordinate their actions by exchanging messages. Close coordination often depends on a shared idea of the time at which the programs' actions occur. But it turns out that there are limits to the accuracy with which the computers in a network can synchronize their clocks: there is no single global notion of the correct time. This is a direct consequence of the fact that the only communication is by sending messages through a network.

Independent failures: All computer systems can fail, and it is the responsibility of system designers to plan for the consequences of possible failures. Distributed systems can fail in new ways. Faults in the network result in the isolation of the computers that are connected to it, but that does not mean that they stop running. In fact, the programs on them may not be able to detect whether the network has failed or has become unusually slow. Similarly, the failure of a computer, or the unexpected termination of a program somewhere in the system (a crash), is not immediately made known to the other components with which it communicates. Each component of the system can fail independently, leaving the others still running.
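Since components in this model interact only by passing messages, a concrete (if toy) illustration may help. The following sketch, which is not from the notes, shows one Java component sending a message to another over TCP and coordinating through the reply; the port number and message contents are arbitrary choices for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

// Toy sketch of coordination purely by message passing: a "server"
// component replies to each message it receives, and the "client"
// component acts only on what the reply tells it. Port 9090 is arbitrary.
public class MessagePassingDemo {
    public static void main(String[] args) throws IOException, InterruptedException {
        ServerSocket listener = new ServerSocket(9090); // bind before the client connects

        Thread server = new Thread(() -> {
            try (Socket conn = listener.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(conn.getInputStream()));
                 PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
                String request = in.readLine();   // the message is the only shared state
                out.println("ack: " + request);   // coordinate by replying
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        server.start();

        try (Socket socket = new Socket("localhost", 9090);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println("hello");
            System.out.println(in.readLine());    // prints "ack: hello"
        }
        server.join();
        listener.close();
    }
}
```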
1.2 EXAMPLES OF DISTRIBUTED SYSTEMS

1.2.1 The Internet

The Internet is a vast interconnected collection of computer networks of many different types. Figure 1.1 illustrates a typical portion of it. Programs running on the computers connected to it interact by passing messages, employing a common means of communication. The design and construction of the Internet communication mechanisms (the Internet protocols) is a major technical achievement, enabling a program running anywhere to address messages to programs anywhere else.

The Internet is also a very large distributed system. It enables users, wherever they are, to make use of services such as the World Wide Web, email and file transfer. The set of services is extended by the addition of server components and new types of service. Intranets are linked together by backbones. A backbone is a network link with a high transmission capacity, employing satellite connections, fibre optic cables and other high-bandwidth circuits.

Multimedia services are available on the Internet, enabling users to access audio and video data including music, radio and TV channels. The implementation of the Internet and the services that it supports has entailed the development of practical solutions to many distributed system issues.

Figure 1.1 A Typical Portion of the Internet

1.2.2 Intranets

An intranet is a portion of the Internet that is separately administered and has a boundary that can be configured to enforce local security policies. Figure 1.2 shows a typical intranet. It is composed of several local area networks (LANs) linked by backbone connections. The network configuration of a particular intranet is the responsibility of the organization that administers it. An intranet is connected to the Internet via a router, which allows users in other intranets to access the services that it provides.

The role of a firewall is to protect an intranet by preventing unauthorized messages from leaving or entering. A firewall is implemented by filtering incoming and outgoing messages. The main issues arising in the design of components for use in intranets are file services, firewalls and the cost of software installation and support.

Figure 1.2 A Typical Intranet
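The notes describe a firewall as filtering incoming and outgoing messages. The sketch below shows that idea in miniature; the Message fields, the blocked domain and the allowed ports are all invented for the example.

```java
import java.util.List;

// Illustrative sketch of firewall-style filtering: each message is checked
// against a small rule set before it may cross the intranet boundary.
// The Message fields and the rules are assumptions for the example.
public class FirewallSketch {
    record Message(String sourceHost, int destinationPort) {}

    // Deny everything except explicitly allowed destination ports.
    private static final List<Integer> ALLOWED_PORTS = List.of(80, 443, 25);

    static boolean permit(Message m) {
        if (m.sourceHost().endsWith(".blocked.example")) return false; // blocked source
        return ALLOWED_PORTS.contains(m.destinationPort());            // allowed services
    }

    public static void main(String[] args) {
        System.out.println(permit(new Message("pc.corp.example", 80)));   // true
        System.out.println(permit(new Message("pc.corp.example", 2049))); // false
    }
}
```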
1.2.3 Mobile and ubiquitous computing

Technological advances in device miniaturization and wireless networking have led increasingly to the integration of small and portable computing devices into distributed systems. These devices include:

• Laptop computers.
• Handheld devices, including mobile phones, smart phones, GPS-enabled devices, pagers, personal digital assistants (PDAs), video cameras and digital cameras.
• Wearable devices, such as smart watches with functionality similar to a PDA.
• Devices embedded in appliances such as washing machines, hi-fi systems, cars and refrigerators.

Mobile computing is the performance of computing tasks while the user is on the move, or visiting places other than their usual environment. In mobile computing, users who are away from their 'home' intranet (the intranet at work, or their residence) are still provided with access to resources via the devices they carry with them. They can continue to access the Internet; they can continue to access resources in their home intranet; and there is increasing provision for users to utilize resources such as printers, or even sales points, that are conveniently nearby as they move around. This is also known as location-aware or context-aware computing.

Ubiquitous computing is the harnessing of many small, cheap computational devices that are present in users' physical environments, including the home, office and even natural settings. Ubiquitous and mobile computing overlap, since the mobile user can in principle benefit from computers that are everywhere. But they are distinct, in general. Ubiquitous computing could benefit users while they remain in a single environment such as the home or a hospital. Similarly, mobile computing has advantages even if it involves only conventional, discrete computers and devices such as laptops and printers.

Figure 1.3 shows a user who is visiting a host organization. The figure shows the user's home intranet and the host intranet at the site that the user is visiting. Both intranets are connected to the rest of the Internet.

Figure 1.3 Portable and handheld devices in a distributed system

1.2.4 Web search

Web search has emerged as a major growth industry in the last decade, with recent figures indicating that the global number of searches has risen to over 10 billion per calendar month. The task of a web search engine is to index the entire contents of the World Wide Web, encompassing a wide range of information styles including web pages, multimedia sources and (scanned) books. Google, the market leader in web search technology, has put significant effort into the design of a sophisticated distributed system infrastructure to support search (and indeed other Google applications and services such as Google Earth). This represents one of the largest and most complex distributed system installations in the history of computing and hence demands close examination. Highlights of this infrastructure include:

• An underlying physical infrastructure consisting of very large numbers of networked computers located at data centres all around the world;
• A distributed file system designed to support very large files and heavily optimized for the style of usage required by search and other Google applications (especially reading from files at high and sustained rates);
• An associated structured distributed storage system that offers fast access to very large datasets;
• A lock service that offers distributed system functions such as distributed locking and agreement;
• A programming model that supports the management of very large parallel and distributed computations across the underlying physical infrastructure.

1.2.5 Massively multiplayer online games (MMOGs)

Massively multiplayer online games offer an immersive experience whereby very large numbers of users interact through the Internet with a persistent virtual world. Leading examples of such games include Sony's EverQuest II and EVE Online from the Icelandic company CCP Games. The engineering of MMOGs represents a major challenge for distributed systems technologies, particularly because of the need for fast response times to preserve the user experience of the game. Other challenges include the real-time propagation of events to the many players and the maintenance of a consistent view of the shared world. MMOGs therefore provide an excellent example of the challenges facing modern distributed systems designers. A number of solutions have been proposed for their design:

• To support large numbers of clients, the server is a complex entity in its own right, consisting of a cluster architecture featuring hundreds of computer nodes. The centralized architecture helps significantly in terms of the management of the virtual world, and the single copy also eases consistency concerns.
• Other MMOGs adopt more distributed architectures in which the universe is partitioned across a (potentially very large) number of servers that may also be geographically distributed. Users are then dynamically allocated to a particular server based on current usage patterns and the network delays to the server.

1.2.6 Financial trading

The financial industry has long been at the cutting edge of distributed systems technology, with its need, in particular, for real-time access to a wide range of information sources. The industry employs automated monitoring and trading applications, which use adapters to translate heterogeneous formats into a common internal format.

The trading system must also deal with a variety of event streams, all arriving at rapid rates and often requiring real-time processing to detect patterns that indicate trading opportunities. This used to be a manual process, but competitive pressures have led to increasing automation in terms of what is known as Complex Event Processing (CEP), which offers a way of composing event occurrences together into logical, temporal or spatial patterns. A toy sketch of this idea follows.
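The sketch below is not any real CEP product; it simply flags a possible trading opportunity when a large buy follows a price drop within one second. The event types, the time window and the pattern are all invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy sketch of complex event processing: detect the temporal pattern
// "price drop followed by a large buy within 1000 ms". The event types
// and the pattern are invented for the example.
public class CepSketch {
    record Event(String type, long timestampMillis) {}

    private final Deque<Event> recentDrops = new ArrayDeque<>();

    void onEvent(Event e) {
        if (e.type().equals("PRICE_DROP")) {
            recentDrops.addLast(e);
        } else if (e.type().equals("LARGE_BUY")) {
            // Discard drops outside the 1000 ms window, then test the pattern.
            while (!recentDrops.isEmpty()
                    && e.timestampMillis() - recentDrops.peekFirst().timestampMillis() > 1000) {
                recentDrops.removeFirst();
            }
            if (!recentDrops.isEmpty()) {
                System.out.println("pattern matched: possible trading opportunity");
            }
        }
    }

    public static void main(String[] args) {
        CepSketch cep = new CepSketch();
        cep.onEvent(new Event("PRICE_DROP", 100));
        cep.onEvent(new Event("LARGE_BUY", 600));   // within the window, so it matches
    }
}
```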
1.3 RESOURCE SHARING AND THE WEB

Many distributed systems can be constructed entirely in the form of interacting clients and servers. The World Wide Web, email and networked printers are all examples of this model.

1.3.1 The World Wide Web

The Web is an open system: it can be extended and implemented in new ways without disturbing its existing functionality. First, its operation is based on communication standards and document standards that are freely published and widely implemented. Second, the Web is open with respect to the types of resource that can be published and shared on it. The Web is based on three main standard technological components:

1. The HyperText Markup Language (HTML), a language for specifying the contents and layout of pages as they are displayed by web browsers.
2. Uniform Resource Locators (URLs), which identify documents and other resources stored as part of the Web.
3. A client-server architecture with standard rules for interaction (HTTP), by which browsers and other clients fetch documents and other resources from web servers.

HTML

The HyperText Markup Language (HTML) is used to specify the text and images that make up the contents of a web page, and to specify how they are laid out and formatted for presentation to the user. A web page contains structured items such as headings, paragraphs, tables and images. HTML is also used to specify links and which resources are associated with them.

URLs

A Uniform Resource Locator (URL) is used to identify a resource. Every URL has two top-level components in its abstract form:

scheme: scheme-specific-identifier

The scheme declares the type of the URL; examples are ftp, http and nntp. HTTP URLs are the most widely used, for accessing resources via the standard HTTP protocol. An HTTP URL identifies the web server that maintains the resource and the required resource at that server. In general, HTTP URLs are of the following form:

http://servername[:port][/pathname][?query][#fragment]

HTTP

The HyperText Transfer Protocol (HTTP) defines the ways in which browsers and other types of client interact with web servers. Figure 1.4 gives an overview of HTTP. The main features of HTTP are:

1. Request-reply interaction
2. Content types
3. One resource per request
4. Simple access control

Figure 1.4 Web Servers and Web Browsers
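To make the URL components and HTTP's request-reply interaction concrete, the following sketch uses Java's standard java.net classes to pick apart an HTTP URL of the form above and then fetch the resource it names; the URL itself is an example only.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch: decompose an HTTP URL into the components named in the notes,
// then perform one request-reply interaction with the server.
public class HttpFetchSketch {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.example.com:80/index.html?q=notes#top");
        System.out.println("server name: " + url.getHost());   // www.example.com
        System.out.println("port:        " + url.getPort());   // 80
        System.out.println("pathname:    " + url.getPath());   // /index.html
        System.out.println("query:       " + url.getQuery());  // q=notes
        System.out.println("fragment:    " + url.getRef());    // top

        // Request-reply: send a GET request, read the status and some content.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        System.out.println("status: " + conn.getResponseCode());
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            System.out.println("first line: " + in.readLine());
        }
        conn.disconnect();
    }
}
```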
Dynamic Pages

A program that a web server runs to generate content for its clients is often referred to as a Common Gateway Interface (CGI) program. A CGI program may have any application-specific functionality, as long as it can parse the arguments that the client provides to it and produce content of the required type. The program will often consult or update a database in processing the request.

1.4 CHALLENGES

1.4.1 Heterogeneity

The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks. Heterogeneity (that is, variety and difference) applies to all of the following:

• networks
• computer hardware
• operating systems
• programming languages
• implementations by different developers

For example, a computer attached to an Ethernet has an implementation of the Internet protocols over the Ethernet, whereas a computer on a different sort of network will need an implementation of the Internet protocols for that network. Different programming languages use different representations for characters and data structures such as arrays and records. These differences must be addressed if programs written in different languages are to be able to communicate with one another. Programs written by different developers cannot communicate with one another unless they use common standards.

Middleware

The term middleware applies to a software layer that provides a programming abstraction as well as masking the heterogeneity of the underlying networks, hardware, operating systems and programming languages. The Common Object Request Broker Architecture (CORBA) is an example. Some middleware, such as Java Remote Method Invocation (RMI), supports only a single programming language (a minimal sketch appears at the end of this subsection). Most middleware is implemented over the Internet protocols, which themselves mask the differences of the underlying networks, but all middleware deals with the differences in operating systems and hardware. In addition to solving the problems of heterogeneity, middleware provides a uniform computational model for use by the programmers of servers and distributed applications. Possible models include remote object invocation, remote event notification, remote SQL access and distributed transaction processing.

Heterogeneity and Mobile Code

The term mobile code refers to program code that can be transferred from one computer to another and run at the destination; Java applets are an example. The virtual machine approach provides a way of making code executable on a variety of host computers: the compiler for a particular language generates code for a virtual machine instead of a particular hardware order code.
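As promised above, here is a minimal sketch of what a remote interface might look like in Java RMI, the single-language middleware the notes mention. The PrintService interface and its methods are invented for the example.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Sketch of the middleware abstraction the notes describe: a client
// programs against this interface, and RMI masks the network, operating
// system and hardware beneath it. The interface itself is invented.
public interface PrintService extends Remote {
    // Every remote method can fail with a communication error, so RMI
    // forces the caller to confront RemoteException explicitly.
    void submitJob(String documentName) throws RemoteException;
    int jobsQueued() throws RemoteException;
}
```

A client would obtain a stub implementing this interface (for example, via the RMI registry) and invoke submitJob exactly as if the print server were local; RMI marshals the invocation into messages behind the scenes.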
1.4.2 Openness

The openness of a computer system is the characteristic that determines whether the system can be extended and reimplemented in various ways. The openness of distributed systems is determined primarily by the degree to which new resource-sharing services can be added and made available for use by a variety of client programs.

Openness cannot be achieved unless the specification and documentation of the key software interfaces of the components of a system are made available to software developers. The designers of the Internet protocols introduced a series of documents called 'Requests For Comments', or RFCs, each of which is known by a number. This series includes discussions as well as the specifications of protocols. RFCs are not the only means of publication; for example, the World Wide Web Consortium (W3C) develops and publishes standards related to the working of the Web.

Systems that are designed to support resource sharing in this way are termed open distributed systems to emphasize the fact that they are extensible. They may be extended at the hardware level by the addition of computers to the network, and at the software level by the introduction of new services and the reimplementation of old ones, enabling application programs to share resources. To summarize:

• Open systems are characterized by the fact that their key interfaces are published.
• Open distributed systems are based on the provision of a uniform communication mechanism and published interfaces for access to shared resources.
• Open distributed systems can be constructed from heterogeneous hardware and software, possibly from different vendors. But the conformance of each component to the published standard must be carefully tested and verified if the system is to work correctly.

1.4.3 Security

Many of the information resources that are made available and maintained in distributed systems have a high intrinsic value to their users, so security is of considerable importance. Security for information resources has three components: confidentiality (protection against disclosure to unauthorized individuals), integrity (protection against alteration or corruption) and availability (protection against interference with the means to access the resources).

Denial of Service Attacks

Another security problem is that a user may wish to disrupt a service for some reason. This can be achieved by bombarding the service with such a large number of pointless requests that the serious users are unable to use it. This is called a denial of service attack. There have been several denial of service attacks on well-known web services. Currently such attacks are countered by attempting to catch and punish the perpetrators after the event, but that is not a general solution to the problem. Countermeasures based on improvements in the management of networks are under development.

Security of Mobile Code

Mobile code needs to be handled with care. If an executable program is received as an electronic mail attachment, the possible effects of running the program are unpredictable; for example, it may seem to display an interesting picture but in reality it may access local resources, or perhaps be part of a denial of service attack.

1.4.4 Scalability

The design of scalable distributed systems presents the following challenges:

Controlling the Cost of Physical Resources

As the demand for a resource grows, it should be possible to extend the system, at reasonable cost, to meet it. For example, the frequency with which files are accessed in an intranet is likely to grow as the number of users and computers increases. It must be possible to add server computers to avoid the performance bottleneck that would arise if a single file server had to handle all file access requests. In general, for a system with n users to be scalable, the quantity of physical resources required to support them should be at most O(n), that is, proportional to n.

Controlling the Performance Loss

Consider the management of a set of data whose size is proportional to the number of users or resources in the system; for example, the table with the correspondence between the domain names of computers and their Internet addresses held by the Domain Name System, which is used mainly to look up DNS names such as www.amazon.com. Algorithms that use hierarchic structures scale better than those that use linear structures. But even with hierarchic structures an increase in size will result in some loss of performance: the time taken to access hierarchically structured data is O(log n), where n is the size of the set of data. For a system to be scalable, the maximum performance loss should be no worse than this.
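As a small illustration of why hierarchic structures keep the performance loss to O(log n), the sketch below stores a name-to-address table in Java's TreeMap, a balanced tree: each lookup step halves the remaining search space, instead of the O(n) cost of scanning a flat list. The entries are invented; the addresses come from the documentation range.

```java
import java.util.TreeMap;

// Sketch: a name-to-address table held in a balanced tree (TreeMap),
// so a lookup costs O(log n) comparisons rather than the O(n) cost of
// scanning a flat list. Names and addresses are illustrative only.
public class HierarchicLookup {
    public static void main(String[] args) {
        TreeMap<String, String> nameTable = new TreeMap<>();
        nameTable.put("www.amazon.com", "192.0.2.1");
        nameTable.put("www.example.org", "192.0.2.2");
        // ... imagine millions of further entries ...

        // O(log n): the underlying red-black tree halves the search space per step.
        System.out.println(nameTable.get("www.amazon.com"));
    }
}
```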
Preventing Software Resources Running Out

An example of a software resource in limited supply is the 32-bit Internet (IP) address space, which was expected to run out. For this lack of scalability, a new version of the protocol with 128-bit Internet addresses is being adopted, and this will require modifications to many software components. To be fair to the early designers of the Internet, there is no correct solution to this problem. It is difficult to predict the demand that will be put on a system years ahead. Moreover, overcompensating for future growth may be worse than adapting to a change when it becomes necessary.

Avoiding Performance Bottlenecks

In general, algorithms should be decentralized to avoid performance bottlenecks. We illustrate this point with reference to the predecessor of the Domain Name System, in which the name table was kept in a single master file that could be downloaded to any computers that needed it. That was fine when there were only a few hundred computers in the Internet, but it soon became a serious performance and administrative bottleneck. The Domain Name System removed this bottleneck by partitioning the name table between servers located throughout the Internet and administered locally.

1.4.5 Failure Handling

Failures in a distributed system are partial, that is, some components fail while others continue to function. The handling of failures is therefore particularly difficult. The following techniques are used for dealing with failures:

Detecting Failures

Some failures can be detected. For example, checksums can be used to detect corrupted data in a message or a file (a small sketch appears at the end of this subsection).

Masking Failures

Some failures that have been detected can be hidden or made less severe. Two examples of hiding failures:

1. Messages can be retransmitted when they fail to arrive.
2. File data can be written to a pair of disks so that if one is corrupted, the other may still be correct.

Tolerating Failures

Most of the services in the Internet do exhibit failures; it would not be practical for them to attempt to detect and hide all of the failures that might occur in such a large network with so many components. Their clients can instead be designed to tolerate failures, which generally involves the users tolerating them as well. For example, when a web browser cannot contact a web server, it does not make the user wait forever while it keeps on trying: it informs the user about the problem, leaving them free to try again later.

Recovery from Failures

Recovery involves the design of software so that the state of permanent data can be recovered or 'rolled back' after a server has crashed. In general, the computations performed by some programs will be incomplete when a fault occurs, and the permanent data that they update (files and other material stored in permanent storage) may not be in a consistent state.

Redundancy

Services can be made to tolerate failures by the use of redundant components. Consider the following examples:

1. There should always be at least two different routes between any two routers in the Internet.
2. In the Domain Name System, every name table is replicated in at least two different servers.
3. A database may be replicated in several servers to ensure that the data remains accessible after the failure of any single server; the servers can be designed to detect faults in their peers, and when a fault is detected in one server, clients are redirected to the remaining servers.

The design of effective techniques for keeping replicas of rapidly changing data up to date without excessive loss of performance is a challenge.

Distributed systems can provide a high degree of availability in the face of hardware faults. The availability of a system is a measure of the proportion of time that it is available for use. When one of the components in a distributed system fails, only the work that was using the failed component is affected. A user may move to another computer if the one that they were using fails; a server process can be started on another computer.
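Returning to failure detection: the sketch promised above uses Java's built-in CRC32 to show how a checksum lets a receiver detect a corrupted message. The message contents and the simulated corruption are invented for the example.

```java
import java.util.zip.CRC32;

// Sketch of failure detection by checksum: the sender transmits the data
// together with its CRC32; the receiver recomputes the checksum and
// compares. A mismatch reveals corruption (though not its location).
public class ChecksumSketch {
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] message = "transfer 100 to account 42".getBytes();
        long sent = checksum(message);

        message[9] = (byte) '9';          // simulate corruption in transit
        long received = checksum(message);

        if (sent != received) {
            System.out.println("corruption detected: request retransmission");
        }
    }
}
```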
1.4.6 Concurrency

Both services and applications provide resources that can be shared by clients in a distributed system. There is therefore a possibility that several clients will attempt to access a shared resource at the same time. For example, a data structure that records bids for an auction may be accessed very frequently when it gets close to the deadline time.

The process that manages a shared resource could take one client request at a time, but that approach limits throughput. Therefore services and applications generally allow multiple client requests to be processed concurrently. To make this more concrete, suppose that each resource is encapsulated as an object and that invocations are executed in concurrent threads. In this case it is possible that several threads may be executing concurrently within an object, in which case their operations on the object may conflict with one another and produce inconsistent results. For an object to be safe in a concurrent environment, its operations must be synchronized in such a way that its data remains consistent. This can be achieved by standard techniques such as semaphores, which are used in most operating systems (a sketch appears after the transparency list below).

1.4.7 Transparency

Transparency is defined as the concealment from the user and the application programmer of the separation of components in a distributed system, so that the system is perceived as a whole rather than as a collection of independent components. The implications of transparency are a major influence on the design of the system software.

• Access transparency enables local and remote resources to be accessed using identical operations.
• Location transparency enables resources to be accessed without knowledge of their physical or network location (for example, which building or IP address).
• Concurrency transparency enables several processes to operate concurrently using shared resources without interference between them.
• Replication transparency enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers.
• Failure transparency enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.
• Mobility transparency allows the movement of resources and clients within a system without affecting the operation of users or programs.
• Performance transparency allows the system to be reconfigured to improve performance as loads vary.
• Scaling transparency allows the system and applications to expand in scale without change to the system structure or the application algorithms.
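Finally, the sketch promised in Section 1.4.6: a shared auction object whose operations are synchronized so that concurrent bids cannot leave its data inconsistent. Java's intrinsic lock (the synchronized keyword) stands in here for the semaphores the notes mention; the bidders and amounts are invented.

```java
// Sketch for Section 1.4.6: a shared auction object whose operations are
// synchronized so that concurrent bids cannot leave its data inconsistent.
public class Auction {
    private int highestBid = 0;
    private String highestBidder = "none";

    // Without synchronization, two threads could interleave the comparison
    // and the two updates, recording a bidder whose bid is not the highest.
    public synchronized boolean bid(String bidder, int amount) {
        if (amount > highestBid) {
            highestBid = amount;
            highestBidder = bidder;
            return true;
        }
        return false;
    }

    public synchronized String winner() {
        return highestBidder + " at " + highestBid;
    }

    public static void main(String[] args) throws InterruptedException {
        Auction auction = new Auction();
        Thread a = new Thread(() -> auction.bid("alice", 100));
        Thread b = new Thread(() -> auction.bid("bob", 120));
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(auction.winner());  // always "bob at 120"
    }
}
```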