Fault Tolerant System for Embedded System Architecture

Abstract: This paper is a study of an approach to designing fault-tolerant embedded system architectures in which hardware and software fault-tolerance methods are combined. There is a trade-off between hardening in hardware and re-execution in software, and an efficient balance between the two gives maximum fault tolerance at the lowest possible cost. The paper works towards a fault-tolerant system that combines hardening at the hardware level with re-execution attempts at the software level at minimum cost. Such a system should have the highest possible reliability and satisfy the conditions required of a fault-tolerant system. We use re-execution of operations to avoid or tolerate temporary faults, such as transient faults, in software; this concept is called time redundancy.


I. Fault Model
Types of Fault
Transient fault: A transient fault appears once and disappears again within a couple of seconds, typically because the environmental condition that caused it has passed.
Intermittent fault: An intermittent fault appears, disappears, and reappears without following any predictable pattern. It is the worst kind of fault because it is difficult to find. It may be caused by poor connections in the circuitry.
Permanent fault: A permanent fault persists until the faulty component is removed, repaired, or replaced by another component. Only then can the affected system function normally again.

A fault can be:
1. Hardware fault: a fault at the hardware level, in components such as processors, communication lines, and switches.
2. Software fault: a fault at the software level.

A combined method called design diversity covers both hardware and software fault tolerance. This approach builds a computer from several different redundant components arranged in separate channels. Each channel is designed to a specification, and a channel is detected as faulty when it fails to perform according to that specification. The main aim is to provide fault tolerance at both the software and the hardware level.

II. Redundancy
A. Time redundancy: The program is set to re-run the affected operation until the faulty circuit has recovered.
B. Hardware redundancy: Extra hardware circuitry is added to the system so that spare components can take over from faulty ones.
C. Software redundancy: Distinct versions of the software are provided. If one version fails, another can take over without the whole system failing.
D. Information redundancy: Extra code bits are added to the data so that bit errors can be detected or corrected. Examples include parity codes, checksum codes, and cyclic codes.
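As an illustration of information redundancy, the following minimal sketch shows how an even-parity bit and a simple additive checksum detect a corrupted word. The function names and the 4-bit data word are hypothetical and chosen only for this example. A single parity bit detects any odd number of flipped bits but cannot correct them.

```python
def add_even_parity(bits):
    """Append one parity bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]


def parity_ok(word):
    """Return True when the word (data bits + parity bit) has an even number of 1s."""
    return sum(word) % 2 == 0


def checksum(data: bytes) -> int:
    """Simple 8-bit additive checksum: sum of all bytes modulo 256."""
    return sum(data) % 256


# Hypothetical 4-bit data word protected by one parity bit.
word = add_even_parity([1, 0, 1, 1])

# Simulate a transient fault that flips a single bit.
corrupted = word.copy()
corrupted[2] ^= 1

print(parity_ok(word))       # fault-free word passes the check
print(parity_ok(corrupted))  # single-bit error is detected
```

Correcting the error, rather than only detecting it, requires a code with more redundancy than a single parity bit.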

III. Hardware Recovery
A central aspect of fault-tolerant design is that the system is built with recovery mechanisms, so that it recovers automatically when random faults appear.
A. Passive (static): Passive recovery relies on fault masking. No extra action is required from the system.
B. Active (dynamic): Active recovery detects faults by comparing results and then removes the faulty components from the system.
C. Hybrid: Hybrid recovery combines the passive and active approaches, masking faults until detection is complete. It is more expensive but also more reliable.

IV. Hardware Redundancy Uses
Hardware redundancy provides extra parts or hardware components so that the system can keep working when some of them fail.
A. Fault detection, correction, and masking: Several hardware components perform the same task in parallel, and their results are compared at the end to remove the effect of a fault.
B. Detection: If one part of the system is faulty while the rest is working correctly, the results disagree. Comparing the actual results is therefore a direct way to detect a fault.
C. Correction and masking: If a small portion of the system is faulty and the larger portion is fault-free, the majority result can be used to mask the fault.
Replacement of malfunctioning units: Correction and masking can only go so far. Beyond that, the level of fault tolerance is determined by restoring the system or replacing the faulty components.

Faults arise from the following causes:
1. Mistakes in specification or design: these mistakes can occur at both the hardware and the software level.
2. Defects in components: hardware faults occur because of defects in the manufacturing process.
3. Operating environment: hardware faults can be caused by hostile surroundings, such as temperature, radiation, and vibration.
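The detection and masking steps described under hardware redundancy can be sketched as a majority voter over redundant units, as used in triple modular redundancy. This is a minimal software model, not a hardware design. The function name `majority_vote` and the example outputs are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(outputs):
    """Mask faults by returning the value produced by a strict majority of units.

    Returns (value, faulty) where `faulty` lists the indices of the units
    whose output disagrees with the majority (fault detection).
    Raises ValueError when no strict majority exists, so masking is impossible.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise ValueError("no majority: the fault cannot be masked")
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty


# Three redundant units compute the same task; the third one is faulty.
value, faulty = majority_vote([42, 42, 7])
print(value, faulty)
```

The list of disagreeing units is what an active or hybrid scheme would use to decide which component to remove or replace.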

Figure 1: Embedded system design and development
An embedded system can be designed for fault tolerance in VLSI by blending fault-tolerance techniques into the design process.

VI. Fault Tolerance Techniques in Embedded Systems Architecture
Fault diagnosis and fault tolerance are part of the software architecture of the system. The software architecture mainly relies on re-execution: when the system is found to be faulty, the fault is first detected and the affected process is then re-executed. Re-execution restores all the initial inputs of the process to remove the effect of the fault. We use the following two techniques to tolerate faults at the software level:
1. Rollback recovery with checkpointing
2. Active replication

A. Rollback Recovery with Checkpointing
Rollback recovery reduces the time spent on re-execution after a fault is detected. The approach finds the most recent error-free state of the execution and restores that state to prevent failure of the whole system. This last fault-free state, the checkpoint, has to be saved in static memory as a backup, so that after a failure the system can be restored from it and a complete system failure is avoided.
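Rollback recovery with checkpointing can be sketched as follows. This is a minimal model under stated assumptions: the process is a list of steps acting on a state dictionary, a checkpoint is an in-memory deep copy standing in for the static memory mentioned above, and `TransientFault`, `run_with_checkpoints`, `add`, and `flaky_add` are hypothetical names introduced for the example.

```python
import copy


class TransientFault(Exception):
    """Raised when a (simulated) error-detection mechanism fires."""


def run_with_checkpoints(steps, state, checkpoint_every=2, max_rollbacks=3):
    """Run steps in order; on a detected fault, roll back to the last checkpoint.

    Returns (final_state, rollbacks).
    """
    saved_index, saved_state = 0, copy.deepcopy(state)  # initial checkpoint
    i, rollbacks = 0, 0
    while i < len(steps):
        try:
            steps[i](state)
            i += 1
        except TransientFault:
            rollbacks += 1
            if rollbacks > max_rollbacks:
                raise  # more faults than the system is designed to tolerate
            # Restore the last fault-free state and re-execute from there.
            i, state = saved_index, copy.deepcopy(saved_state)
            continue
        if i % checkpoint_every == 0:
            saved_index, saved_state = i, copy.deepcopy(state)
    return state, rollbacks


def add(n):
    def step(state):
        state["total"] += n
    return step


def flaky_add(n):
    fired = {"done": False}

    def step(state):
        if not fired["done"]:          # the transient fault occurs only once
            fired["done"] = True
            state["total"] = -999      # corrupted value
            raise TransientFault("error detected")
        state["total"] += n
    return step


steps = [add(1), add(2), flaky_add(3), add(4)]
final_state, rollbacks = run_with_checkpoints(steps, {"total": 0})
print(final_state, rollbacks)
```

Only the steps after the last checkpoint are repeated, which is why checkpointing costs less time than re-executing the whole process from its initial inputs.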
B. Active and Passive Replication
The main disadvantage of recovery methods is that they are restricted in how they can use the redundant capacity of the available computation nodes. In contrast to rollback recovery and re-execution, active and passive replication can make use of this redundant capacity: replication runs separate copies of a process in parallel so that a fault in one copy can be recovered from another.

C. Transparency
Transient faults can only be handled by adjusting the system's execution dynamically when a fault occurs, so that system failure is prevented. The number of possible system operations grows with the number of tolerated transient faults. To correct, test, or adjust the circuit, all of these operations have to be taken into account. Debugging, verification, and testing therefore become very difficult. An effective solution to this difficulty is transparency.
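The difference between active and passive replication can be sketched in a few lines. This is a simplified model, assuming the usual definitions: in active replication every replica executes, while in passive replication a backup executes only after the primary has failed. The node names, the squaring computation, and the execution log are hypothetical, and the loop is a sequential stand-in for replicas running in parallel on different computation nodes.

```python
class NodeFault(Exception):
    """Simulated fault detected on a computation node."""


def make_node(name, faulty, log):
    def node(x):
        log.append(name)          # record that this node executed
        if faulty:
            raise NodeFault(name)
        return x * x              # hypothetical computation
    return node


def active_replication(nodes, x):
    """Every replica executes; the first fault-free result is used."""
    results = []
    for node in nodes:
        try:
            results.append(node(x))
        except NodeFault:
            pass
    if not results:
        raise NodeFault("all replicas failed")
    return results[0]


def passive_replication(nodes, x):
    """A backup executes only after the primary (or an earlier backup) fails."""
    for node in nodes:
        try:
            return node(x)
        except NodeFault:
            continue
    raise NodeFault("all replicas failed")


active_log, passive_log = [], []
active_nodes = [make_node("N1", True, active_log),
                make_node("N2", False, active_log),
                make_node("N3", False, active_log)]
passive_nodes = [make_node("N1", True, passive_log),
                 make_node("N2", False, passive_log),
                 make_node("N3", False, passive_log)]

active_result = active_replication(active_nodes, 4)
passive_result = passive_replication(passive_nodes, 4)
print(active_result, active_log)
print(passive_result, passive_log)
```

Both schemes return the same result here, but the logs differ: active replication uses all three nodes every time, whereas passive replication uses a backup node only when a fault has occurred.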

IX. Conclusion and Future Scope
Tools and experience are most mature for the single stuck-at fault model. Stuck-at fault tests also cover many other faults, such as multiple stuck-at, stuck-open, and bridging faults. Technology-dependent faults, delay faults, and stuck-short faults require special tests. Different test models exist for analyzing fault tolerance, especially in analog and memory circuits. A system can be fault tolerant if it is designed for specific test sets. As circuits advance, systems require new approaches and technologies for fault testing and diagnosis. Circuit testing takes many forms because systems combine hardware and software. Building a fault-tolerant circuit that combines hardware and software, that is, an embedded system architecture, is therefore an advanced research topic. Novel approaches aim at the maximum degree of fault tolerance while staying with stuck-at fault models. Fault-tolerance research is already present in many fields, such as control systems, transportation, electronic commerce, space, and communications, as well as many other areas that truly affect our lives.
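The single stuck-at fault model can be illustrated with a small gate-level simulation. This sketch assumes a hypothetical circuit y = (a AND b) OR c and a fault on the internal net between the two gates. A test vector detects the fault when the fault-free and faulty circuits produce different outputs.

```python
from itertools import product


def circuit(a, b, c, stuck=None):
    """Compute y = (a AND b) OR c.

    `stuck` optionally forces the internal net n (output of the AND gate)
    to 0 or 1, modelling a single stuck-at fault on that net.
    """
    n = a & b
    if stuck is not None:
        n = stuck
    return n | c


def detecting_vectors(stuck):
    """List all input vectors (a, b, c) whose output differs under the fault."""
    return [v for v in product((0, 1), repeat=3)
            if circuit(*v) != circuit(*v, stuck=stuck)]


print(detecting_vectors(0))  # vectors that detect net n stuck-at-0
print(detecting_vectors(1))  # vectors that detect net n stuck-at-1
```

In this circuit every detecting vector has c = 0, because c = 1 forces the output to 1 and hides the state of the internal net.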