UNIX Fault Management

UNIX Fault Management : A Guide for System Administrators

3.66 (9 ratings by Goodreads)
By (author) Brad Stone, By (author) Julie Symons

List price: US$39.99

Description

  • Establishing cost-effective, reliable system monitoring procedures on any UNIX system, including clusters.
  • Identifying, investigating, and recovering from server problems.
  • Specific approaches to monitoring systems, disks, networks, applications, and databases, including best practices for enterprise-class UNIX installations.

If you're responsible for maintaining the integrity and availability of a mission-critical UNIX system, this is the first book that brings together all the information you need most. UNIX Fault Management Administrator's Handbook describes exactly how to implement appropriate, cost-effective system monitoring on any UNIX server, including systems configured as high availability clusters. You'll find detailed descriptions of fault monitoring tools and monitoring frameworks to help you make better purchasing decisions; a detailed overview of the monitoring tasks operators perform; and specific techniques for investigating and recovering from problems. The book includes coverage of monitoring systems, disks, networks, applications, and databases, as well as specific fault management techniques for large-scale enterprises.

Product details

  • Paperback | 368 pages
  • 175.26 x 231.14 x 17.78mm | 430.91g
  • Pearson Education (US)
  • Prentice Hall
  • Upper Saddle River, United States
  • English
  • w. figs.
  • 013026525X
  • 9780130265258

Table of contents

1. Analyzing the Role of System Operators. Trends in System Operations.
2. Enumerating Possible Events. Defining Fault Management. Event Categories. Configuration Events. Faults. Resource and Performance Events. Security Intrusions. Environmental Changes.
3. Using Monitoring Frameworks. Distinguishing Monitoring Frameworks. Monitored Components. Monitoring Features. Monitor Discovery and Configuration. Monitor Developer's Kits. Notification Methods. Diagnostic Capabilities. IT/Operations. Monitored Components. Monitoring Features. Monitor Discovery and Configuration. Monitor Developer's Kit. Notification Methods. Diagnostic Capabilities. Additional Information. Unicenter TNG. Monitored Components. Monitoring Features. Monitor Discovery and Configuration. Monitor Developer's Kit. Notification Methods. Diagnostic Capabilities. Additional Information. Event Monitoring Service. Monitored Components. Monitoring Features. Monitor Discovery and Configuration. Monitor Developer's Kit. Notification Methods. Diagnostic Capabilities. Additional Information. PLATINUM ProVision. Monitored Components. Monitoring Features. Monitor Discovery and Configuration. Monitor Developer's Kit. Notification Methods. Diagnostic Capabilities. Additional Information. BMC PATROL. Monitored Components. Monitoring Features. Monitor Discovery and Configuration. Monitor Developer's Kit. Notification Methods. Diagnostic Capabilities. Additional Information. MeasureWare. Monitored Components. Monitoring Features. Monitor Discovery and Configuration. Monitor Developer's Kit. Notification Methods. Diagnostic Capabilities. Additional Information.
4. Monitoring the System. Identifying Important System Monitoring Categories. Monitoring System Configuration Changes. Monitoring System Faults. Monitoring System Resource Utilization. Monitoring System Security. Monitoring System Performance. Using Standard Commands and Tools. bdf and df. ioscan. iostat. ipcs. mailstats. ps. sar. swapinfo. sysdef. timex. top. uname. uptime. vmstat. who. Using System Instrumentation. SNMP. DMI. Using Graphical Status Monitors. OpenView Network Node Manager. ClusterView. Unicenter TNG. Using Event Monitoring Tools. Event Monitoring Service. EMS High Availability Monitors. EMS Hardware Monitors. Enterprise SyMON. OpenView IT/Operations. GlancePlus Pak 2000. Security Monitoring. Security Overview. Security Monitoring Tools. Using Diagnostic Tools. Support Tool Manager. HP Predictive Support. HA Observatory. Monitoring System Peripherals. Disks. Tapes. Printers. Collecting System Performance Data. MeasureWare. GlancePlus. PerfView. BMC PATROL for UNIX. Candle. Using System Performance Data. Avoiding Performance Issues. Detecting CPU Contention. Checking System Resource Usage. Detecting Memory and Swap Contention. Detecting Disk and File System Bottlenecks. Avoiding System Problems. Recovering from System Problems. Comparing System Monitoring Tools. Case Study: Recovering from Memory Faults. Verifying Configuration. Setting Up Monitoring and Reconfiguration. Memory Board Failure Occurs. Fixing the Failure and Restoring Service.
5. Monitoring the Disks. Identifying Important Disk Monitoring Categories. Using Standard Commands and Tools. bdf and df. diskinfo. fsck. ioscan. lvdisplay. pvdisplay. vgdisplay. Using System Instrumentation. Simple Network Management Protocol. Desktop Management Interface. Using Event Monitoring Tools. Event Monitoring Service Disk Volume Monitor. EMS Hardware Monitors. HARAYMON and ARRAYMOND. OpenView IT/Operations. Enterprise SyMON. Using Diagnostic Tools. Support Tool Manager. HP Predictive Support. Collecting Disk Performance Data. MeasureWare. GlancePlus. PerfView. BMC PATROL. Using Disk Performance Data. Avoiding Disk Problems. Recovering from Disk Problems. Comparing Disk Monitoring Products. Case Study: Configuring and Monitoring for Mirrored Disks. Verifying Configuration. Setting Up Monitoring. Mirror Fails. Restoring Mirrors. Verifying Configuration.
6. Monitoring the Network. Identifying Important Network Components to Monitor. Using Graphical Network Status Monitors. Network Node Manager. IT/Operations. Unicenter TNG. Enterprise SyMON. Monitoring Network Interface Card and Cable Failures. Using SNMP Instrumentation. Using Standard Commands and Tools. Using Additional Products To Monitor Network Links. Using Link-Specific Commands. Monitoring Networking and Transport Protocols. Using SNMP Instrumentation. Using Standard Commands and Tools. Monitoring Network Services. Monitoring DHCP/BOOTP Servers. Monitoring DNS/NIS Name Servers. Monitoring FTP. Monitoring NFS. Monitoring Remote Connectivity. Monitoring Web Servers. Monitoring Network Hosts. Network Node Manager. netstat. Interconnect & Router Manager. Collecting Network Performance Data. Using RMON and RMON-II Instrumentation. NetMetrix Site Manager. MeasureWare. GlancePlus. PerfView. BMC PATROL for \. Network General Sniffer Pro. Using Network Performance Data. Avoiding Performance Issues. Detecting Overloaded Network Servers. Detecting Network Congestion. Avoiding Network Problems. Recovering from Network Problems. Isolating the Fault. Network and Lower Layers. Transport and Higher Layers.
7. Monitoring the Application. Important Application Components to Monitor. Identifying Application Types. Using Standard Commands and Tools. ps. top. vmstat. Using System Instrumentation. SNMP. DMI. pstat. Fault Detection Tools. IT/Operations. MC/ServiceGuard. ClusterView. Event Monitoring Service. EcoSNAP. Monitoring Tools for ERP Applications. Envive. SMART Plug-Ins. BMC PATROL Knowledge Modules. EcoSYSTEMS. Resource and Performance Monitoring Tools. Application Resource Measurement. MeasureWare. GlancePlus. PerfView. Process Resource Manager. Controlling Application Performance. Recovering from Application Problems. Comparison of Application Monitoring Products.
8. Monitoring the Database. Identifying Important Database Monitoring Categories. Configuring the Database. Watching for Database Faults. Managing Database Resources and Performance. Keeping the Database Server Secure. Ensuring Successful Database Backups. Using Standard Database Commands and Tools. UNIX Commands. SQL Commands. SNMP MIB Monitoring. Database Vendor Tools. Using Fault Detection and Recovery Tools. MC/ServiceGuard. ClusterView. EMS HA Monitors. Resource and Performance Monitoring Tools. Application Resource Measurement. Oracle Trace. Oracle V$ Tables. GlancePlus Pak 2000. Oracle Management Pak. PerfView. SMART Plug-Ins for Databases. BMC PATROL Knowledge Modules. PLATINUM DBVision. Using Database Performance Data. Avoiding Performance Issues. Checking for System Contention. Checking for Disk Bottlenecks. Checking Database Buffer and Pool Sizes. Avoiding Database Problems. Recovering from Database Problems. Comparison of Database Monitoring Products.
9. Enterprise Management. Monitoring Across an Enterprise. Identifying Events. Using Event Correlation Tools. OpenView Event Correlation Services. Seagate NerveCenter. IT Masters MasterCell. Monitoring Multiple Systems. IT/O. ClusterView. Enterprise Management Frameworks. ClusterView. Monitoring Agents. Using Multiple Tools. IT/Operations and the Network Node Manager. IT/O and PerfView. BMC PATROL and IT/O. PLATINUM ProVision and IT/O. EMS and OpenView NNM or IT/O.
10. UNIX Futures. Future Trends in Fault Management.
Appendix A: Glossary.
Index.

About Brad Stone

Brad Stone and Julie Symons are UNIX professionals at Hewlett Packard in Cupertino, CA.

Rating details

9 ratings
3.66 out of 5 stars
5 stars: 44% (4)
4 stars: 11% (1)
3 stars: 22% (2)
2 stars: 11% (1)
1 star: 11% (1)
Book ratings by Goodreads