Difference between Event and Incident (memo)
Glossary definitions:
Event
(Service Operation) A change of state which has significance for the management of a Configuration Item or IT Service.
The term Event is also used to mean an Alert or notification created by any IT Service, Configuration Item or Monitoring tool. Events typically require IT Operations personnel to take actions, and often lead to Incidents being logged.
Event Management
(Service Operation) The Process responsible for managing Events throughout their Lifecycle. Event Management is one of the main Activities of IT Operations.
Incident
(Service Operation) An unplanned interruption to an IT Service or a reduction in the Quality of an IT Service. Failure of a Configuration Item that has not yet impacted Service is also an Incident. For example Failure of one disk from a mirror set.
Incident Management
(Service Operation) The Process responsible for managing the Lifecycle of all Incidents. The primary Objective of Incident Management is to return the IT Service to Users as quickly as possible.
Detailed article found online (with examples):
What is the difference between Event and Incident Management? Whenever I attend ITIL seminars or ITIL exam courses (especially the ones that touch on Event and Incident Management), this is one of the most commonly asked questions, and the discussion tends to drag on. So today I decided to blog about the difference between Event Management and Incident Management.
Keep in mind that this discussion is limited to the factors that help you tell the two apart. What you can take away from this article is a better understanding of Event and Incident Management, which will certainly help in your ITIL certification exam. At the very least, it can be handy when explaining the differences between an Event and an Incident to your colleagues or ITIL implementation sponsor(s). The exam will certainly include questions meant to trip you up on Event versus Incident Management, but they only work if you do not have a clear distinction between the two.
Before we go further, let us clearly define what an Event is and what an Incident is, using the OGC’s ITIL V3 Service Operation book. According to the OGC’s book, an Event is a change of state that has significance for the management of a Configuration Item or IT Service. An Incident, by contrast, is an unplanned interruption to an IT Service or a reduction in the Quality of an IT Service; failure of a Configuration Item that has not yet impacted service is also an Incident.
There you go: the key terms have been properly defined, so now let us deal with the differences between the two.
Event Management
According to the OGC’s ITIL V3 Service Operation book, Event Management monitors all events that occur throughout the IT infrastructure, both to monitor normal operation and to detect and escalate exception conditions.
Based on that definition, we can use a race car analogy. In a professional race car, you have a set of gauges to pay attention to: the tachometer, the oil pressure gauge, the oil temperature gauge, the water temperature gauge, and other gauges that monitor the status of other components of the car.
The data from these main gauges is sent to the team on the sidelines. The team that monitors this data is comparable to the Event Management team. For the purpose of this analogy, let’s call them “Racing team Event Management”. Any red, orange, or other colored warning flagged on the on-board gauges is tracked by this team. If the race car’s engine overheats, one of the on-board gauges detects it and the data is sent to Racing team Event Management. The engine overheating is an event, and it is handled by Racing team Event Management. In a real-life IT environment, the race car could be a server hosting multiple business-critical applications. The overheating engine could be high CPU utilization, and your Event Management team could be the Tivoli/Sitescope guys, the level 1 tech support team, or a dedicated Event Management team. Remember that in ITIL, an individual or a team can wear multiple hats.
When an event is detected, a predetermined and agreed-upon process is triggered, either automatically or manually. The response can be as simple as paying extra attention to the next alert to see whether CPU usage increases by another 10%, or as drastic as immediately raising an Incident record and kicking off the Major Incident handling process.
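The escalation rule in the CPU example can be sketched in Python. This is only an illustration under stated assumptions, not something taken from the ITIL books: the `CpuEvent` type, the `handle_cpu_event` function, and the 90% threshold are hypothetical. Only the "another 10%" watch rule comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CpuEvent:
    server: str
    cpu_percent: float


# Hypothetical thresholds; each organization would agree on its own values.
WATCH_INCREASE = 10.0      # "increases by another 10%" from the text (percentage points)
MAJOR_INCIDENT_AT = 90.0   # assumed level at which an Incident is raised at once


def handle_cpu_event(event: CpuEvent, previous: Optional[CpuEvent]) -> str:
    """Return the predetermined action for a CPU event.

    Event Management only escalates here; it never resolves anything.
    """
    if event.cpu_percent >= MAJOR_INCIDENT_AT:
        return "raise_incident"
    if previous is not None and event.cpu_percent - previous.cpu_percent >= WATCH_INCREASE:
        return "raise_incident"
    return "watch_next_alert"


first = CpuEvent("app01", 75.0)
print(handle_cpu_event(first, None))                      # first alert: keep watching
print(handle_cpu_event(CpuEvent("app01", 86.0), first))   # rose by 11 points: escalate
```

Note that every branch returns an action to hand off, never a fix, which matches the point that Event Management escalates and Incident Management resolves.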
What Event Management is:
• Monitor normal operation (BAU) activities
• Flags occurrences that might be of interest, depending on the organization’s definition of what is of “interest” to the business. In ITIL terms, such occurrences are called exception conditions.
• Escalates any exception condition. Notice the use of the word escalation; it does not say resolution.
Remember the keywords here: escalation is for Event Management (it triggers a predetermined process) and resolution is for Incident Management (its goal is immediate restoration of the degraded service).
• Event Management serves as the entry point or trigger for most of Service Operation’s defined tasks (e.g. if an event alert is triggered because a printer’s ink usage has reached 98%, notify the Desktop Support team to change the ink cartridge).
• Event Management provides a way to compare the actual performance and behavior of a system against the agreed-upon and designed standards (e.g. the SLA).
• It provides input to Service Assurance and Continual Service Improvement (CSI) so that the service’s standards or processes can be adjusted as needed to meet the business requirements.
What Event Management is not:
• Event Management is NOT intended to resolve, or implement permanent fixes for, the events that were identified.
• Event Management is NOT meant to identify improvements based on the reports it generates. That is the job of CSI. Event Management only provides statistical data and information to CSI.
Incident Management
Incident Management, on the other hand, concentrates on restoring unexpectedly degraded or disrupted services to users as quickly as possible, in order to minimize business impact.
Event Management identifies occurrences that are of interest to the business. Within the predetermined list of events, some fall under the defined exception conditions. A good example: if the response time of a business-critical application increases by 88%, trigger an alert and raise an Incident record for the Incident Management team. This example shows the process signature of an event: a condition is met, an alert is sent, and an operational task is triggered. The operational task can be routine (for ordinary valid events) or a fairly complex resolution effort (for valid exception conditions, which are mostly escalated as Incidents).
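The event signature in the response-time example (condition met, alert sent, task triggered) might look like the following minimal Python sketch. The 200 ms baseline, the function names, and the action strings are assumptions made for illustration; only the 88% increase comes from the article.

```python
from typing import List

BASELINE_MS = 200.0          # assumed normal response time (hypothetical)
EXCEPTION_INCREASE = 0.88    # the 88% increase used in the article


def is_exception(response_ms: float, baseline_ms: float = BASELINE_MS) -> bool:
    """Condition: response time has risen by at least 88% over the baseline."""
    return (response_ms - baseline_ms) / baseline_ms >= EXCEPTION_INCREASE


def process_measurement(app: str, response_ms: float) -> List[str]:
    """Follow the event signature: condition met, send alert, trigger task."""
    actions: List[str] = []
    if is_exception(response_ms):
        actions.append(f"alert: {app} response time {response_ms:.0f} ms")
        actions.append(f"incident: raised for {app}")
    else:
        actions.append(f"log: {app} within normal range")
    return actions


print(process_measurement("billing", 250.0))  # ordinary event, only logged
print(process_measurement("billing", 400.0))  # exception condition, escalated
```

The sketch stops once the Incident record is raised; restoring the service is left to Incident Management.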
What Incident Management is:
• Incident Management has a single purpose: to restore service to users at normal standards and an acceptable level.
• Incident Management deals with all incidents. This includes failures reported by event monitoring tools as well as questions and inquiries reported by users (either by phone through the Service Desk or via an Incident record).
• Incident Management provides input to the Problem Management team as needed, to assist in identifying the underlying root cause of the incident.
• Incident Management provides statistical data (as required) to CSI. The data could include the number of repeat incidents, the number of incidents triggered by a change, and the average incident duration before resolution, among other information. Some organizations call these KPIs (Key Performance Indicators).
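The statistics named in the last bullet can be computed from Incident records along these lines. This is a hedged sketch: the `IncidentRecord` fields and the definition of a "repeat incident" (any incident after the first on the same Configuration Item) are assumptions for illustration, not ITIL definitions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IncidentRecord:
    ci: str                    # affected Configuration Item (hypothetical field)
    caused_by_change: bool     # True if the incident was triggered by a change
    minutes_to_resolve: float  # duration before resolution


def incident_kpis(records: List[IncidentRecord]) -> Dict[str, float]:
    """Compute the three example statistics mentioned in the article."""
    per_ci = Counter(r.ci for r in records)
    # Assumption: every incident after the first on the same CI is a repeat.
    repeats = sum(count - 1 for count in per_ci.values())
    change_related = sum(1 for r in records if r.caused_by_change)
    average = (
        sum(r.minutes_to_resolve for r in records) / len(records) if records else 0.0
    )
    return {
        "repeat_incidents": repeats,
        "incidents_triggered_by_change": change_related,
        "average_minutes_to_resolve": average,
    }


records = [
    IncidentRecord("db01", False, 30.0),
    IncidentRecord("db01", True, 90.0),
    IncidentRecord("web01", False, 60.0),
]
print(incident_kpis(records))
```

The function only reports numbers; interpreting them and deciding on improvements remains the job of CSI.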
What Incident Management is not:
• Incident Management is NOT responsible for implementing a permanent fix or permanent solution to an issue. Its purpose is to restore the service as soon as possible. It does not matter if the solution used is a “bubble gum” solution (e.g. sticking bubble gum over a hole in a boat’s hull so that it can reach the other side of the river and deliver the goods on board).
• Incident Management does NOT identify, and is NOT responsible for identifying, the root cause of an issue.
Note: this article was reposted from: h... ncident-management/ After reading it, I immediately understood the difference between Event and Incident much better.