300+ [UPDATED] Distributed Computing Interview Questions

1. Define Distributed System?

A distributed system is a collection of independent computers that appears to its users as a single coherent system. Equivalently, it is a system in which components located at networked computers communicate and coordinate their actions only by passing messages.

2. List The Characteristics Of Distributed System?
- Programs are executed concurrently
- There is no global time
- Components can fail independently (isolation, crash)

3. Mention The Examples Of Distributed System?
- The Internet
- Intranets
- Mobile and ubiquitous computing

4. What Is Mobile And Ubiquitous Computing?

Mobile:
Computing devices are being carried around.

Ubiquitous:
Little computing devices are all over the place.

5. Mention The Challenges In Distributed System?
1. Heterogeneity
2. Openness
3. Security
4. Scalability
5. Failure handling
6. Concurrency
7. Transparency

6. What Are The Advantages Of Distributed Systems?
- Performance
- Distribution
- Reliability (fault tolerance)
- Incremental growth
- Sharing of data/resources
- Communication

7. What Are The Disadvantages Of Distributed Systems?
- Difficulties of developing distributed software
- Networking problems
- Security problems

8. Write The Difference Between Mobile And Ubiquitous Computing?
- Ubiquitous computing can benefit users while they remain in a single environment such as a home or a hospital.
- Mobile computing has advantages even when it involves only conventional, discrete devices such as laptops and printers.

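9. Why Do We Need Openness?

Openness is the degree to which a computer system can be extended and re-implemented. It depends on key interfaces being published as standards, for example by the following bodies: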

IEEE = Institute of Electrical and Electronics Engineers

e.g., IEEE 802.11 WLAN, IEEE 802.3 Ethernet

W3C = World Wide Web Consortium

e.g., HTML Recommendations

10. What Are The Security Mechanisms Used In Distributed Computing?

Encryption:

E.g. Blowfish, RSA

Authentication:

E.g. password, public key authentication

Authorization:

E.g. access control lists

11. How Do We Provide Security?

Confidentiality:

Protection against disclosure to unauthorized individuals.

E.g. ACLs (access control lists) to provide authorized access to information.

Integrity:

Protection against alteration or corruption.

E.g. changing the account number or amount value in a money order

Availability:

Protection against interference targeting access to the resources.

E.g. denial of service (DoS, DDoS) attacks

Non-repudiation:

Proof of sending or receiving information.

E.g. digital signature

12. Define Scalability?

A scalable system works efficiently at many different scales, ranging from a small intranet to the Internet.

Challenges of designing scalable distributed systems:
- Controlling the cost of physical resources: cost should increase linearly with system size.
- Controlling performance loss: for example, in hierarchically structured data, search performance loss due to data growth should not be worse than O(log n), where n is the size of the data.
- Preventing software resources from running out: for example, the numbers used to represent Internet addresses (32 bits in IPv4, extended to 128 bits in IPv6), or Y2K-like problems.
- Avoiding performance bottlenecks: use decentralized algorithms (for example, DNS replaced a single centralized name table with a partitioned, decentralized one).

13. What Are The Different Types Of System Model?
- Architectural model
- Fundamental models:
  - Interaction model
  - Failure model
  - Security model

14. What Is The Use Of Middleware?

Middleware is a layer of software whose purpose is to mask heterogeneity and to provide a convenient programming model to application programmers. Middleware is represented by processes or objects in a set of computers that interact with each other to implement communication and resource-sharing support for distributed applications.

15. Define Protocol?

The term protocol is used to refer to a well-known set of rules and formats to be used for communication between processes in order to perform a given task.

The definition of a protocol has two important parts to it:
- A specification of the sequence of messages that must be exchanged;
- A specification of the format of the data in the messages.

16. What Is Meant By Internet Protocol?

The Internet Protocol (IP) transmits datagrams from one host to another, if necessary via intermediate routers. The IP header contains several fields that are used by the transmission and routing algorithms.

17. Define Mobile IP?

Mobile IP is an Internet Engineering Task Force (IETF) standard communications protocol designed to allow mobile device users to move from one network to another while maintaining their permanent IP address. Originally defined in Request for Comments (RFC) 2002, Mobile IP is an enhancement of the Internet Protocol (IP) that adds mechanisms for forwarding Internet traffic to mobile devices (known as mobile nodes) when they are connected through a network other than their home network.

18. What Is The Architectural Model?

An architectural model defines the way in which the components of a system interact with one another and the way in which they are mapped onto an underlying network of computers.

19. What Is The Fundamental Model?

Fundamental models help to reveal key problems for the designers of distributed systems. Their purpose is to specify the design issues, difficulties and threats that must be resolved in order to develop distributed systems that fulfill their tasks correctly, reliably and securely. The fundamental models provide abstract views of just those characteristics of distributed systems that affect their dependability: correctness, reliability and security.

20. Write About The Parts Available In Routing Algorithm?

A routing algorithm has two parts:
1. It must make decisions that determine the route taken by each packet as it travels through the network. In circuit-switched network layers such as X.25 and in frame relay networks such as ATM, the route is determined whenever a virtual circuit or connection is established. In packet-switched network layers such as IP, it is determined separately for each packet, and the algorithm must be particularly simple and efficient if it is not to degrade network performance.
2. It must dynamically update its knowledge of the network based on traffic monitoring and the detection of configuration changes or failures. This activity is less time-critical, so slower and more computation-intensive techniques can be used.
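
The two parts can be sketched with a distance-vector style routing table. This is only an illustration under stated assumptions: the answer above does not name a particular algorithm, and the router names, link costs and function names below are hypothetical.

```python
# A distance-vector sketch (an assumption, not the only possible routing algorithm).
INF = float("inf")


def next_hop(table, destination):
    """Per-packet decision: a single, cheap table lookup."""
    link, _cost = table[destination]
    return link


def merge_advert(table, neighbor, link_cost, advert):
    """Knowledge update: fold a neighbor's advertised costs into the local table.

    table maps destination -> (outgoing link, cost); advert maps destination -> cost.
    Returns True if any entry changed.
    """
    changed = False
    for dest, cost in advert.items():
        new_cost = cost + link_cost
        link, old_cost = table.get(dest, (None, INF))
        # Adopt a cheaper route, or refresh a route that already goes via this neighbor.
        if new_cost < old_cost or (link == neighbor and new_cost != old_cost):
            table[dest] = (neighbor, new_cost)
            changed = True
    return changed


# Hypothetical router A with direct links to B (cost 1) and C (cost 5).
table_a = {"A": ("local", 0), "B": ("B", 1), "C": ("C", 5)}
merge_advert(table_a, "B", 1, {"A": 1, "B": 0, "C": 2})
print(next_hop(table_a, "C"))
```

After the update, traffic for C leaves via B, because the advertised path through B (cost 3) is cheaper than the direct link (cost 5).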

21. What Is Meant By Interprocess Communication?

Interprocess communication is concerned with the communication between processes in a distributed system, both in its own right and as support for communication between distributed objects. The Java API for interprocess communication in the Internet provides both datagram and stream communication.

22. What Is The Difference Between RMI And RPC?

Remote Procedure Call (RPC) and Remote Method Invocation (RMI) are both message-passing techniques for interprocess communication (IPC).

There are two basic differences between them:
1. RPC supports procedural programming, i.e. only remote procedures can be invoked, whereas RMI is object-based: as the name suggests, methods are invoked on remote objects.
2. In RPC, the parameters passed are ordinary data structures, whereas in RMI, objects can be passed as parameters.

23. Define Datagram?

A datagram is, to quote the Internet’s Request for Comments 1594, “a self-contained, independent entity of data carrying sufficient information to be routed from the source to the destination computer without reliance on earlier exchanges between this source and destination computer and the transporting network.” The term is used in several well-known communication protocols, including the User Datagram Protocol and AppleTalk.

24. What Is The Use Of UDP?

UDP datagrams are sometimes an attractive choice because they do not suffer from the overheads associated with guaranteed message delivery. For example, the Domain Name Service (DNS), which looks up DNS names in the Internet, is implemented over UDP.
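
A minimal sketch of datagram communication, using Python's standard `socket` module on the loopback interface. The payload is a made-up string, not a real DNS query; the point is only that a datagram is sent without connection set-up or acknowledgement.

```python
import socket

# Receiver: bind to an ephemeral port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # UDP gives no delivery guarantee, so never wait forever
address = receiver.getsockname()

# Sender: no connection set-up; each sendto() call is one self-contained datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"lookup example.org", address)

data, source = receiver.recvfrom(1024)
print(data.decode(), "from", source)

sender.close()
receiver.close()
```

On a real network the datagram could be lost, so an application such as a name lookup has to supply its own timeout and retry.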

25. What Is Meant By Client Server Communication?

The client–server model of computing is a distributed application structure that partitions tasks or workloads between the providers of a resource or service, called servers, and service requesters, called clients.

26. What Is Meant By Group Communication?

For group communication, a multicast operation is more appropriate than pairwise message exchange. A multicast is an operation that sends a single message from one process to each of the members of a group of processes, usually in such a way that the membership of the group is transparent to the sender.

27. What Is The Use Of RMI Registry?

The RMI registry is used to store a list of available services. A client uses the registry to obtain its proxy object, and the registry is responsible for giving the client the information it needs to connect to the server that implements the service.

28. Difference Between Synchronous And Asynchronous Communication?

In the synchronous form of communication, the sending and receiving processes synchronize at every message. In this case, both send and receive are blocking operations. Whenever a send is issued, the sending process is blocked until the corresponding receive is issued. Whenever a receive is issued, the process blocks until a message arrives.

In the asynchronous form of communication, the send operation is non-blocking: the sending process is allowed to proceed as soon as the message has been copied to a local buffer, and the transmission of the message proceeds in parallel with the sending process. The receive operation can have blocking and non-blocking variants.
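
The asynchronous form can be sketched inside one program, with an in-memory queue standing in for the local buffer and a thread standing in for the receiving process. This is an analogy under those assumptions, not real network communication.

```python
import queue
import threading

mailbox = queue.Queue()  # unbounded buffer: put() returns as soon as the message is stored

# Non-blocking receive: returns immediately, reporting that nothing has arrived.
try:
    mailbox.get_nowait()
except queue.Empty:
    print("no message yet")

results = []


def receiver():
    # Blocking receive: the thread waits here until a message arrives.
    results.append(mailbox.get())


worker = threading.Thread(target=receiver)
worker.start()

mailbox.put("hello")  # asynchronous send: the sender proceeds once the message is buffered
worker.join()
print(results)
```

A synchronous send would additionally make the sender wait until the receiver had taken the message.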

29. What Is Marshalling And Unmarshalling?

Marshalling is the process of taking a collection of data items and assembling them into a form suitable for transmission in a message. Unmarshalling is the process of disassembling them on arrival to produce an equivalent collection of data items at the destination.
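
A minimal sketch of marshalling and unmarshalling with Python's standard `struct` module. The wire format used here (length-prefixed UTF-8 strings followed by a 32-bit integer, in network byte order) is an assumption chosen for illustration, not a standard external data representation, and the record fields are hypothetical.

```python
import struct


def marshal(name, place, year):
    """Flatten (str, str, int) into bytes suitable for transmission."""
    parts = []
    for text in (name, place):
        raw = text.encode("utf-8")
        parts.append(struct.pack("!I", len(raw)) + raw)  # 4-byte length, then the bytes
    parts.append(struct.pack("!i", year))                # 4-byte signed integer
    return b"".join(parts)


def unmarshal(message):
    """Rebuild an equivalent (str, str, int) tuple at the destination."""
    offset = 0
    fields = []
    for _ in range(2):
        (length,) = struct.unpack_from("!I", message, offset)
        offset += 4
        fields.append(message[offset:offset + length].decode("utf-8"))
        offset += length
    (year,) = struct.unpack_from("!i", message, offset)
    return fields[0], fields[1], year


wire = marshal("Smith", "London", 1984)
print(len(wire), unmarshal(wire))
```

Both sides must agree on the field order and encoding in advance, which is what an external data representation standardizes.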

30. What Is CDR?

CORBA's Common Data Representation (CDR) is the external data representation defined with CORBA 2.0. CDR can represent all of the data types that can be used as arguments and return values in remote invocations in CORBA. It consists of 15 primitive types, which include short (16-bit), long (32-bit), unsigned short, unsigned long, float (32-bit), double (64-bit), char, boolean (TRUE or FALSE), octet (8-bit) and any (which can represent any basic or constructed type), together with a range of composite types.

31. Define XML?
- XML stands for Extensible Markup Language.
- XML is a markup language much like HTML.
- XML was designed to carry data, not to display data.
- XML tags are not predefined. You must define your own tags.
- XML is designed to be self-descriptive.
- XML is a W3C Recommendation.

32. Define Operating System?

An operating system is the layer between the hardware and the application software.

An Operating System is responsible for the following functions:
- Device management using device drivers
- Process management using processes and threads
- Interprocess communication
- Memory management
- File systems.

33. List The Core OS Components With Diagram?

Process manager:
Handles the creation of and operations upon processes. A process is a unit of resource management, including an address space and one or more threads.

Thread manager:
Thread creation, synchronization and scheduling. Threads are scheduled activities attached to processes.

Communication manager:
Communication between threads attached to different processes on the same computer. Some kernels also support communication between threads in remote processes. Other kernels have no notion of other computers built into them, and an additional service is required for external communication.

Memory manager:
Management of physical and virtual memory.

Supervisor:
Dispatching of interrupts, system call traps and other exceptions; control of the memory management unit and hardware caches; processor and floating-point unit register manipulations. This is known as the Hardware Abstraction Layer in Windows NT.

34. How Does The Kernel Use The Address Space?

Often the kernel code and data are mapped into every address space at the same location. When a process makes a system call or an exception occurs, there is no need to switch to a new set of address mappings.

35. What Is A System Call Trap? How Is It Implemented?

A system call trap is the invocation mechanism for resources managed by the kernel. It is implemented by a machine-level TRAP instruction, which puts the processor into supervisor mode and switches to the kernel address space.

36. What Is An Execution Environment? What Does It Contain?

An execution environment is the unit of resource management: a collection of local kernel-managed resources to which its threads have access.

An execution environment primarily consists of:
- An address space;
- Thread synchronization and communication resources such as semaphores and communication interfaces (for example sockets);
- Higher-level resources such as open files and windows.

37. List The Two Types Of Thread Scheduling? Explain?

There are two types of thread scheduling.

They are:
1. Preemptive scheduling:
  A thread may be suspended at any point to make way for another thread, even when the preempted thread would otherwise continue running.
2. Non-preemptive scheduling:
  A thread runs until it makes a call to the threading system, when the system may de-schedule it and schedule another thread to run.

38. List The Types Of Event That The Kernel Notifies To The User-Level Scheduler?

The kernel notifies the user-level scheduler of four types of event:
- Virtual processor allocated
- Scheduler activation blocked
- Scheduler activation unblocked
- Scheduler activation preempted

39. Difference Between Monolithic And Micro Kernel?
- A microkernel-based OS can enforce modularity behind memory protection boundaries.
- A microkernel is relatively small, so it is more likely to be free of bugs than a larger, more complex monolithic kernel.
- A monolithic OS is relatively efficient in the way operations can be invoked, because an invocation to a separate user-level address space is more costly than a system call.

40. Write A Note On LRPC With Diagram?
- A lightweight remote procedure call (LRPC) is an invocation between two processes on the same machine.
- Shared memory regions are used for efficient client-server communication, with a different region between the server and each of its local clients.
- The client and server stubs use the same stack.

41. What Is The Goal Of Security? List The Three Broad Classes Of Security Threats?

The main goal of security is to restrict access to information and resources to just those principals that are authorized to have access.

Security threats fall into three broad classes:
- Leakage
- Tampering
- Vandalism

42. What Are The Two Measures Taken By JVM To Protect The Local Environment?

The two measures taken by the JVM to protect the local environment are:
1. The downloaded classes are stored separately from local classes, preventing them from replacing local classes with spurious versions.
2. The bytecodes are checked for validity. Valid Java bytecode is composed of Java virtual machine instructions from a specified set. The instructions are also checked to ensure that they will not produce certain errors when the program runs, such as accessing illegal memory addresses.

43. What Is Cryptography? What Is The Use Of It?

Cryptography is the art of encoding information in a format that only the intended recipients can access.

Uses:
- Secrecy and integrity
- Authentication
- Digital signatures.

44. Write A Note On Digital Signature?

Requirements:
- To authenticate stored document files as well as messages
- To protect against forgery
- To prevent the signer from repudiating a signed document (denying their responsibility)

Encryption of a document in a secret key constitutes a signature:
- Impossible for others to perform without knowledge of the key
- Strong authentication of the document
- Strong protection against forgery
- Weak against repudiation (the signer could claim that the key was compromised).
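
The secret-key case can be sketched with a keyed hash (HMAC) from Python's standard library. Note the substitution: this computes a message authentication code over the document rather than encrypting the document itself, but it has the same properties listed above. The key and document contents are hypothetical.

```python
import hashlib
import hmac


def sign(secret_key: bytes, document: bytes) -> bytes:
    """Produce a tag that only holders of secret_key can compute."""
    return hmac.new(secret_key, document, hashlib.sha256).digest()


def verify(secret_key: bytes, document: bytes, signature: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(secret_key, document), signature)


key = b"hypothetical shared secret"
document = b"Pay 100 to account 12345"
tag = sign(key, document)

print(verify(key, document, tag))                     # the untouched document verifies
print(verify(key, b"Pay 900 to account 12345", tag))  # a tampered document does not
```

Because verifier and signer hold the same key, either could have produced the tag, which is why this scheme is weak against repudiation. Public-key signatures address that.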

45. What Are Credentials?
- Credentials are a set of evidence provided by a principal when requesting access to a resource.
- It would be inconvenient to require users to interact with the system and authenticate themselves each time their authority is required to perform an operation on a protected resource. Instead, a credential is taken to speak for the principal.

46. Write A Note On X.500 Directory Service?

X.500 is a directory service. It can be used in the same way as a conventional name service, but it is primarily used to satisfy descriptive queries designed to discover the names and attributes of other users or system resources.

47. What Is Name Space?

A name space is the collection of all valid names recognized by a particular service. For a name to be valid means that the service will attempt to look it up, even though that name may prove not to correspond to any object, that is, to be unbound.

Example:
The name “Two” could not possibly be the name of a UNIX process, whereas the integer “2” might be.

48. What Is The Use Of Iterative Navigation?

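DNS supports the model known as iterative navigation. To resolve a name, a client presents the name to the local name server, which attempts to resolve it. If the local name server has the name, it returns the result immediately. If it does not, it suggests another server that will be able to help, and the client continues the resolution at that server, navigating further as necessary until the name is located or is found to be unbound.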
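
Iterative navigation can be sketched with a toy set of name servers. The sketch assumes that a server which cannot resolve a name replies with a referral to another server, which the client then contacts itself. The server names, host names and addresses are all hypothetical.

```python
# Each hypothetical server maps a name to ("answer", value) or ("refer", other_server).
SERVERS = {
    "local": {
        "printer.lab": ("answer", "10.0.0.7"),
        "www.example.org": ("refer", "root"),
    },
    "root": {"www.example.org": ("refer", "org")},
    "org": {"www.example.org": ("answer", "192.0.2.10")},
}


def resolve_iteratively(name, start="local", max_steps=10):
    """Return (value or None, list of servers the client contacted)."""
    server = start
    contacted = []
    for _ in range(max_steps):
        contacted.append(server)
        kind, value = SERVERS[server].get(name, ("unbound", None))
        if kind == "answer":
            return value, contacted
        if kind == "unbound":
            return None, contacted
        server = value  # the client, not the server, follows the referral
    raise RuntimeError("too many referrals")


print(resolve_iteratively("printer.lab"))
print(resolve_iteratively("www.example.org"))
```

The client does all the navigation work here. Each server answers at most one request per resolution and never contacts another server on the client's behalf.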

49. Define Multicast Navigation?

A client multicasts the name to be resolved and the required object type to the group of name servers. Only the server that holds the named attributes responds to the request.

50. Write Short Notes On Directory Services?

A service that stores collections of bindings between names and attributes, and that looks up entries matching attribute-based specifications, is called a directory service.

Example:
Microsoft’s Active Directory Services, X.500 and its cousin LDAP, Univers.

51. What Is Clock Skew And Clock Drift?

The instantaneous difference between the readings of any two clocks is called their skew.

Clock drift means that they count time at different rates, and so diverge.

52. What Are The Two Modes Of Synchronization? Write Their Format?

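The two modes are:
1. External synchronization:
  For a synchronization bound D > 0 and a source S of UTC time, |S(t) - Ci(t)| < D for i = 1, 2, ..., N and for all real times t in the interval of interest. The clocks Ci are then accurate to within the bound D.
2. Internal synchronization:
  For a synchronization bound D > 0, |Ci(t) - Cj(t)| < D for i, j = 1, 2, ..., N and for all real times t in the interval of interest. The clocks Ci then agree within the bound D.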

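53. How Is Clock Synchronization Done In Cristian’s Method?

A client asks a time server for the time and measures the round-trip time of the request, T_round. On receiving the server's time t, it sets its clock to t + T_round/2, assuming that the delay is split evenly between the request and the reply.

A single time server might fail, so Cristian suggested the use of a group of synchronized servers.

The method does not deal with faulty servers.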
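
As commonly described, Cristian's method has the client measure the round-trip time of its request and add half of it to the time the server returned. A minimal sketch of that estimate follows. The function names are hypothetical, and a fake clock replaces the real one so that the numbers are reproducible.

```python
import time


def cristian_estimate(request_server_time, clock=time.monotonic):
    """Return (estimated current server time, measured round-trip time)."""
    sent_at = clock()
    server_time = request_server_time()  # stands in for the network request
    round_trip = clock() - sent_at
    # Assume the reply spent half of the round trip in transit.
    return server_time + round_trip / 2, round_trip


# Usage with a fake local clock: the request leaves at 10.0 and the reply arrives at 10.4.
ticks = iter([10.0, 10.4])
estimate, round_trip = cristian_estimate(lambda: 100.0, clock=lambda: next(ticks))
print(estimate, round_trip)
```

The estimate is only as good as the assumption that the two directions take equal time, so a shorter measured round trip gives a tighter bound on the error.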

54. List The Design Aims And Features Of NTP?
- To provide a service enabling clients across the Internet to be synchronized accurately to UTC.
- To provide a reliable service that can survive lengthy losses of connectivity.
- To enable clients to resynchronize sufficiently frequently to offset the rates of drift found in most computers.
- To provide protection against interference with the time service whether malicious or accidental.

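55. Explain The Happened-before Relation With An Example?

The happened-before relation is a partial order on events that reflects a flow of information between them. For example, if a is the sending of a message by one process and b is the receipt of that message by another process, then a happened before b.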

56. Write The Rules For Updating The Clocks?

The rules for updating vector clocks are:
- Initially, Vi[j] = 0, for i, j = 1, 2, ..., N.
- Just before pi timestamps an event, it sets Vi[i] := Vi[i] + 1.
- pi includes the value t = Vi in every message it sends.
- When pi receives a timestamp t in a message, it sets Vi[j] := max(Vi[j], t[j]), for j = 1, 2, ..., N.
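
The four rules above can be sketched as a small class. Process indices start at 0 here rather than 1. The class and function names are hypothetical, and treating a receive as an event that is itself timestamped is an assumption of this sketch.

```python
class VectorClock:
    def __init__(self, n, i):
        self.i = i            # index of the owning process
        self.v = [0] * n      # rule 1: every entry starts at 0

    def tick(self):
        """Rule 2: increment the own entry just before timestamping an event."""
        self.v[self.i] += 1
        return list(self.v)

    def send(self):
        """Rule 3: timestamp the send event and piggyback the vector on the message."""
        return self.tick()

    def receive(self, t):
        """Rule 4: take the component-wise maximum, then timestamp the receive event."""
        self.v = [max(a, b) for a, b in zip(self.v, t)]
        return self.tick()


def happened_before(a, b):
    """a -> b exactly when a <= b component-wise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
sent = p0.send()            # event on p0
local = p1.tick()           # unrelated event on p1
received = p1.receive(sent)
print(sent, local, received)
print(happened_before(sent, received), happened_before(sent, local))
```

The send happened before the receive, while the send and the unrelated local event on p1 are concurrent: neither vector is less than or equal to the other.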

57. What Are The Issues Resolved By Berkeley’s Algorithm?

The computers whose clocks are to be synchronized are categorized as a master and slaves. The master polls the slaves for their clock values and averages them; the averaging cancels out the individual clocks' tendencies to run fast or slow.

Instead of sending the updated time back to the slaves, the master sends each slave the amount by which its clock needs adjusting. This avoids the further uncertainty that message transmission time would introduce into a clock value returned by the master.
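
A minimal sketch of the averaging step, assuming the master has already collected one reading from each slave. Round-trip compensation and the exclusion of faulty clocks are deliberately left out. The function name and the sample readings (in minutes) are hypothetical.

```python
def berkeley_adjustments(master_time, slave_times):
    """Return the adjustment for the master followed by one per slave.

    Each machine receives a delta to apply to its own clock rather than an
    absolute time, so the delay of the reply message does not distort the result.
    """
    readings = [master_time] + list(slave_times)
    average = sum(readings) / len(readings)
    return [average - reading for reading in readings]


# Master reads 3:00 (180 min); the slaves read 3:25 (205 min) and 2:50 (170 min).
print(berkeley_adjustments(180, [205, 170]))
```

The average is 185 minutes (3:05), so the master advances by 5, the fast slave goes back by 20 and the slow slave advances by 15. A real clock would usually be slowed down rather than set backwards.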

58. What Is Network Partition?

A network partition separates a group of replica managers into two or more subgroups.

The members of the same subgroup can communicate with one another, but members of different subgroups cannot.

59. Difference Between Reliable And Unreliable Failure Detector?

A reliable failure detector is one that is always accurate in detecting a process failure. It answers processes' queries with a response of either Unsuspected, which can only be a hint, or Failed.

An unreliable failure detector may produce one of two values when given the identity of a process: Unsuspected or Suspected. Both of these results are hints, which may or may not accurately reflect whether the process has actually failed.

60. Define Election Algorithm? Mention The Different Algorithm?

An algorithm for choosing a unique process to play a particular role is called an election algorithm.

Example:
In a variant of the central-server algorithm for mutual exclusion, the server is chosen from among the processes.

The different algorithms are:
- Ring based election algorithm
- Bully algorithm.

61. Define Multicast Communication?

It is the implementation of group communication. Multicast communication requires coordination and agreement. The aim is for members of a group to receive copies of messages sent to the group. Many different delivery guarantees are possible.

Example:
Agreement on the set of messages received or on delivery ordering.

62. List The Requirements Of Consensus Algorithm To Hold For Execution?

The requirements of consensus algorithm to hold for execution are:
- Termination
- Agreement
- Integrity.

63. Write A Note On Bully Algorithm?

This algorithm allows processes to crash during an election, although it assumes that message delivery between processes is reliable. It also assumes that the system is synchronous: it uses timeouts to detect a process failure.
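
A simplified simulation of one election, assuming that the process with the highest identifier among the live processes should win. Real implementations exchange election, answer and coordinator messages concurrently. This sketch collapses that into a loop that follows only the lowest-numbered responder in each round, and the function name is hypothetical.

```python
def bully_election(initiator, alive):
    """Simulate one election; alive is the set of ids that answer before the timeout.

    Returns (coordinator, trace), where trace lists the processes that ran a round.
    """
    if initiator not in alive:
        raise ValueError("the initiator must itself be alive")
    trace = []
    candidate = initiator
    while True:
        trace.append(candidate)
        # Election messages go to every process with a higher id;
        # crashed processes stay silent, which the timeout detects.
        responders = sorted(p for p in alive if p > candidate)
        if not responders:
            return candidate, trace  # nobody answered: the candidate becomes coordinator
        candidate = responders[0]    # a higher process answers and takes over the election


# Processes 1 to 4 exist, but process 4 (the old coordinator) has crashed.
coordinator, trace = bully_election(1, alive={1, 2, 3})
print(coordinator, trace)
```

Process 3 wins because every process with a higher identifier failed to answer within the timeout.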

64. What Is Distributed Debugging?

Distributed debugging checks whether a transitory state, rather than a stable state, has occurred in an actual execution.

This is done by recording the system's global states.