Fault Management


The Fault Management function is the set of processes used to efficiently repair a network fault detected by the surveillance function. This includes dispatching the relevant technical resources to assist in repairing the fault condition.

This function is driven by processes that define how a fault is to be handled. Faults fall into three main handling categories:
  1. Service affecting - Focus on service restoration - immediate and continuous response.
    • Call out a technician / engineer to assist with fault analysis and service restoration.
    • Call out field resources.
    • Manage vendor support (e.g. facilitate communications between field staff and vendors).
  2. Service at risk - Focus on service continuity while controlling costs and managing the risk.
    • E.g. loss of system redundancy -> call out during business hours to repair.
    • Rule based.
  3. Non-service affecting - Focus on repairing or resolving non-critical faults in a cost-effective manner.
    • E.g. building infrastructure repairs -> log fault and track resolution.
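
The three categories above lend themselves to a simple rule-based lookup. The following is a minimal Python sketch of that idea, not a description of any particular fault management system. The names Fault, FaultCategory, categorize and plan_response, and the two boolean flags used to pick a category, are hypothetical and chosen only for illustration. The response actions are paraphrased from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class FaultCategory(Enum):
    SERVICE_AFFECTING = 1
    SERVICE_AT_RISK = 2
    NON_SERVICE_AFFECTING = 3


@dataclass
class Fault:
    description: str
    # Hypothetical flags; a real system would derive these from alarm data.
    service_down: bool = False      # service is impacted right now
    redundancy_lost: bool = False   # service is up but no longer protected


# Response actions per category, paraphrased from the list above.
RESPONSES = {
    FaultCategory.SERVICE_AFFECTING: [
        "call out technician / engineer immediately",
        "call out field resources",
        "manage vendor support",
    ],
    FaultCategory.SERVICE_AT_RISK: [
        "call out during business hours to repair",
    ],
    FaultCategory.NON_SERVICE_AFFECTING: [
        "log fault and track resolution",
    ],
}


def categorize(fault: Fault) -> FaultCategory:
    """Apply the rules in priority order: the most severe category wins."""
    if fault.service_down:
        return FaultCategory.SERVICE_AFFECTING
    if fault.redundancy_lost:
        return FaultCategory.SERVICE_AT_RISK
    return FaultCategory.NON_SERVICE_AFFECTING


def plan_response(fault: Fault) -> list:
    """Return the list of response actions for the fault's category."""
    return RESPONSES[categorize(fault)]


if __name__ == "__main__":
    fault = Fault("loss of system redundancy", redundancy_lost=True)
    print(categorize(fault).name, plan_response(fault))
```

Checking the rules in order of severity means a fault that is both service affecting and at risk is treated as service affecting, which matches the priority implied by the ordering of the categories.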