From socrates sid
Answered By Jim Dennis
What are concurrent processes how they work in distributed and shared systems?Can they be executed parallel or they just give the impression of running parallel.
[JimD]
"Concurrent processes" isn't a special term of art. A process is a program running on a UNIX/Linux system, created with fork() (a special form of the clone() system call under Linux). A process has its own (virtual) memory space. Under Linux a different form of the clone() system call creates a "thread" (specifically a kernel thread). Kernel threads have their own process IDs (PIDs) but share their memory with other threads in their process.
There are a number of technical differences between processes and kernel threads under Linux (mostly having to do with signal dispatching). The gist of it is that a process is a memory space and a scheduling and signal handling unit; while a kernel thread is just a scheduling and signal handling unit. Processes also have their own security credentials (UIDs, GIDs, etc.) and file descriptors. The kernel threads within a process share a common identity and a common set of file descriptors.
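To make the memory-space distinction concrete, here's a minimal sketch, assuming Python 3 on Linux (where os.fork() wraps fork() and threading.Thread is backed by a kernel thread). The names counter, bump, child_process_sees_own_copy and thread_shares_memory are made up for this illustration. A forked child changes its own private copy of a variable, while a thread changes the one the whole process sees.

```python
import os
import threading

counter = 0  # module-level variable living in this process's memory


def bump():
    """Increment the global counter in whichever memory space we run in."""
    global counter
    counter += 1


def child_process_sees_own_copy():
    """Fork a child that bumps the counter; return (child's view, parent's view)."""
    global counter
    counter = 0
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: works on a private copy of the parent's memory.
        os.close(read_fd)
        bump()
        os.write(write_fd, str(counter).encode())
        os._exit(0)  # exit at once, without running the parent's cleanup code
    # Parent process: read the child's value, then reap the child.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_value, counter


def thread_shares_memory():
    """Run bump() in a thread; return the counter as seen by the main thread."""
    global counter
    counter = 0
    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    return counter
```

Calling child_process_sees_own_copy() should show the child's increment staying in the child, while thread_shares_memory() should show the thread's increment visible to the caller.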
There are also "pseudo-threads" (pthreads) which are implemented within a process via library support; pseudo-threads are not a kernel API, and a kernel need not have any special support for them. The main differences between kernel threads and pthreads have to do with blocking characteristics. If a pthread makes a "blocking" form of a system call (such as read() or write()) then the whole process (all threads) can be blocked. Obviously the library should provide support to help the programmer avoid doing these things; there used to be separate thread-aware (re-entrant) versions of the C libraries to link against pthreads programs under Linux. However, all recent versions of glibc (the GNU C library used by all mainstream Linux systems) are re-entrant and have clearly defined thread-safe APIs. (In some cases, like strtok(), there are special re-entrant versions, such as strtok_r(), which must be used explicitly --- due to some historical interactions between those functions and certain global variables.)
Kernel threads can make blocking system calls as appropriate to their needs -- since other threads in that process group will still get time slices scheduled to them independently.
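A rough sketch of that behaviour, again assuming Python 3 on Linux, where each threading.Thread is a kernel thread: one thread sits in a blocking read() on a pipe while the main thread carries on with its own work and eventually writes the data that unblocks the reader. The function name blocking_read_demo and the message are invented for the example.

```python
import os
import threading


def blocking_read_demo():
    """Block one thread in read() while the main thread keeps working."""
    read_fd, write_fd = os.pipe()
    result = {}

    def reader():
        # This call blocks until something is written to the pipe.
        result["data"] = os.read(read_fd, 16)

    blocked = threading.Thread(target=reader)
    blocked.start()

    # The main thread is still scheduled independently of the reader.
    result["work"] = sum(range(100000))

    # Wake the reader up; this write could never happen if the
    # blocking read had stalled the whole process.
    os.write(write_fd, b"wake up")
    blocked.join()

    os.close(read_fd)
    os.close(write_fd)
    return result
```

If a blocking call in one thread froze every thread in the process, the write would never run and the join would hang forever.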
Other parts of your question (which appears to be a lame "do my homework" posting, BTW) are too vague and lack sufficient context to answer well.
For example: Linux is not a "distributed system." You can build distributed systems using Linux --- by providing some protocol over any of the existing communications (networking and device interface) mechanisms. You could conceivably implement a distributed system over a variety of different process, kernel thread, and pthread models and over a variety of different networking protocols (mostly over TCP/IP and UDP, but possibly also using direct, lower-level Ethernet frames, or by implementing custom protocols over any other device).
- (I've heard of a protocol that was done over PC parallel ports; limited bandwidth but very low latencies! Reducing latency is often far more important in tightly coupled clusters than bandwidth).
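As a toy illustration of "providing some protocol" over an existing mechanism, here's a hedged sketch in Python 3 of a one-message request/acknowledge exchange over UDP. Both ends live in one process on the loopback address just to keep it self-contained; in a real distributed system they would be on separate nodes. The "ACK " reply format and the function name udp_round_trip are purely made up for this example.

```python
import socket


def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a local 'node' and return its acknowledgement."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 asks the kernel for any free port on the loopback interface.
        server.bind(("127.0.0.1", 0))
        server.settimeout(2.0)
        client.settimeout(2.0)

        # "Node A" sends a request.
        client.sendto(message, server.getsockname())

        # "Node B" receives it and answers using our invented protocol.
        request, sender = server.recvfrom(1024)
        server.sendto(b"ACK " + request, sender)

        # "Node A" collects the reply.
        reply, _ = client.recvfrom(1024)
        return reply
    finally:
        client.close()
        server.close()
```

UDP gives no delivery guarantee, so a real protocol would need timeouts and retries on top of this; the sketch only adds a timeout so it can't hang.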
So, the question:
What are concurrent processes how they work in distributed and shared systems?
... doesn't make sense (even if we ignore the poor grammar). I also don't know what a "shared system" is. It is also not a term of art.
On SMP (symmetric multiprocessor) systems the Linux kernel initializes all available CPUs (processors) and basically lets them compete to run processes. Each CPU, at each 10ms context switch time, scans the run list (the list of processes and kernel threads which are ready to run --- i.e. not blocked on I/O and not waiting or sleeping), picks one, grabs a lock on it, and runs it for a while. It's actually considerably more complicated than that --- since there are features that try to implement "processor affinity" (to ensure that processes will tend to run on the same CPU from one context switch to another --- to take advantage of any L1 cache lines that weren't invalidated by the intervening processes/threads) and many other details.
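The affinity described above is something the scheduler attempts on its own. Linux also lets a program set an explicit CPU mask, which is a different (manual) mechanism but shows the same idea of tying a process to a CPU. A small sketch, assuming Python 3 on Linux, where os.sched_getaffinity() and os.sched_setaffinity() are available; pin_to_one_cpu is a name invented for the example.

```python
import os


def pin_to_one_cpu():
    """Pin this process to a single CPU, record the result, then undo it."""
    original = os.sched_getaffinity(0)   # CPUs we are allowed to run on (0 = ourselves)
    chosen = min(original)               # arbitrary pick: the lowest-numbered CPU
    os.sched_setaffinity(0, {chosen})    # from now on, only run on that CPU
    pinned = os.sched_getaffinity(0)
    os.sched_setaffinity(0, original)    # restore the original mask
    return original, chosen, pinned
```

Pinning by hand like this is rarely needed; it mostly matters for benchmarking or for latency-sensitive work where cache warmth is critical.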
However, the gist of this MP model is that processes and kernel threads may be executing in parallel. The context switching provides the "impression" (multi-tasking) that many processes are running "simultaneously" by letting each do a little work, so in aggregate they've all done some things (responded) on any human-perceptible time scale.
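Here's a sketch of the practical upshot, assuming Python 3 on Linux: the work is split across several forked processes, and the kernel is free to run them on different CPUs at the same time (or to time-slice them on a single CPU; the program can't tell the difference from the result). The name parallel_sum and the way the range is divided are just illustrative choices.

```python
import os


def parallel_sum(n, workers=None):
    """Sum range(n) using several forked worker processes."""
    if workers is None:
        workers = os.cpu_count() or 1

    children = []
    for i in range(workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: sum every workers-th number starting at i, report, exit.
            os.close(read_fd)
            partial = sum(range(i, n, workers))
            os.write(write_fd, str(partial).encode())
            os._exit(0)
        # Parent: keep only the read end and go on to start the next child.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        total += int(os.read(read_fd, 64).decode())
        os.close(read_fd)
        os.waitpid(pid, 0)
    return total
```

On a uniprocessor the children simply take turns; on an SMP box they may genuinely overlap. Either way the parent gets the same total.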
Obviously a "distributed" system has multiple processors (in separate systems) and thus runs processes on each of those "nodes" -- which is truly parallel. An SMP machine is a little like a distributed system (cluster of machines) except that all of the CPUs share the same memory and other devices. A NUMA (non-uniform memory access) system is a form of MP (multi-processing) where the CPUs share the same memory --- but some of the RAM (memory) is "closer" to some CPUs than to others (in terms of latency and access characteristics). In other words the memory isn't quite as "symmetrical."
However, an "asymmetric MP" system would be one where there are multiple CPUs that have different functions --- so some CPUs are dedicated to some sorts of tasks while other CPUs perform other operations. In many ways a modern PC with a high end video card is an example of an asymmetric MP system. A modern "GPU" (graphics processing unit) has quite a bit of memory and considerable processor power of its own; and the video drivers provide ways for the host system to offload quite a bit of work (texturing, polygon shifting, scaling, shading, rotations, etc.) onto the video card.
To a more subtle degree the hard drives, sound cards, Ethernet and some SCSI, RAID, and FireWire adapters in a modern PC are other examples of asymmetric multi-processing, since many of these have CPUs, memory and programs (often in firmware, but sometimes overridden by the host system). However, that point is moot and I might have to debate someone at length to arrive at a satisfactory distinction between "intelligent peripherals" and asymmetric MP. In general the phrase "asymmetric multi-processing" is simply not used in modern computing; so the "S" in "SMP" seems to be redundant.