NLA eClips Service Incident - Report
Problem:
At approximately 9:30 this morning an incident occurred which impacted eClips service delivery. The incident was resolved at 10:00. During the incident the service was intermittently unavailable, which affected clients' ability to view eClips content.
Cause:
The root cause of the incident is still under investigation by NLA engineers. However, early indications are that the eClips database license checking process became less responsive when serving a higher than normal number of requests. This process is being investigated as a potential area requiring optimization.
Solution:
NLA engineers are now reviewing the eClips core code related to this aspect of the service, with the aim of identifying the root cause and optimizing the code to prevent a recurrence.
The NLA engineering team is also preparing to deploy a new database architecture which will be more resilient and scalable. This should also help to prevent such an incident from recurring.
NLA Service Operations Management
CANCELLED: NLA maintenance - Saturday 4th October 2008 at 13:00
NLA engineers will be carrying out configuration changes to the NLA eClips environment on Saturday 4th October at 13:00. Web and FTP services will be intermittently unavailable for up to three hours while the work takes place.
Notifications will be sent before the work begins and as soon as it is complete.
We apologise for any inconvenience this may cause.
NLA eClips Email Issues
Yesterday the NLA implemented changes to our email system which create further resilience and fault tolerance. Unfortunately, during the implementation, new rules and restrictions within the system blocked some emails from being passed to third-party recipients. In particular, this issue impacted client requests sent to the reprocessing@nla.co.uk alias. None of the requests sent to this alias yesterday were routed to Ninestars, and therefore no requests for reprocessing were performed.
The NLA apologises for any impact this may have had on PCA eClips production workflow. The issue has now been resolved and all emails are being routed appropriately.
NLA eClips Service Incident - Report
Problem:
At approximately 9:20 yesterday an incident occurred which impacted eClips service delivery. The incident was resolved at 10:15. During the incident the service was intermittently unavailable, which affected clients' ability to view eClips content.
Cause:
The root cause of the incident is still under investigation by NLA engineers. However, early indications are that, when serving a high number of multi-object requests containing more than the permitted 100 objects, the eClips web application spawned unnecessary additional connections to the eClips database, which in turn impacted performance. Yesterday we experienced a higher than normal number of these large multi-object requests.
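For clients who build multi-object requests programmatically, the 100-object limit mentioned above could be checked before a request is sent. The following is a minimal Python sketch only. The function names and the idea of passing a list of object identifiers are illustrative assumptions and are not part of the eClips API or of NLA code.

```python
# Hypothetical client-side check; names here are illustrative assumptions.
MAX_OBJECTS_PER_REQUEST = 100  # the permitted limit stated in this report


def check_multi_object_request(object_ids):
    """Return (accepted, message) for a proposed multi-object request."""
    count = len(object_ids)
    if count > MAX_OBJECTS_PER_REQUEST:
        return False, (
            f"request contains {count} objects; "
            f"the limit is {MAX_OBJECTS_PER_REQUEST}"
        )
    return True, "ok"


def split_into_batches(object_ids, size=MAX_OBJECTS_PER_REQUEST):
    """Split an oversized list into batches that each respect the limit."""
    return [object_ids[i:i + size] for i in range(0, len(object_ids), size)]
```

In this sketch a list of 250 identifiers would be rejected by the check and could instead be sent as three smaller requests produced by split_into_batches.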
Solution:
Once the cause of the incident was understood by NLA engineers, the service was restarted and returned to normal operation immediately. NLA engineers are now putting in place further monitoring which will provide an earlier warning of a potential recurrence of this type of incident. The NLA is also investigating the eClips core code related to this aspect of the service, with the aim of identifying the root cause and redeveloping the code to prevent a recurrence.
Separately, the NLA engineering team is preparing to deploy a new database architecture which will be more resilient and scalable. This should also help to prevent such an incident from recurring.
NLA eClips: Emergency Maintenance
NLA engineers will be carrying out emergency maintenance on the eClips database storage device today at 19:00. This will require stopping all eClips web and FTP services for up to 60 minutes.
Further notification will be sent when the work is complete.
We apologise for any inconvenience this may cause.
