Practice of System and Network Administration, The

Book

  • Sorry, this book is no longer in print.
Not for Sale

Description

  • Copyright 2002
  • Edition: 1st
  • Book
  • ISBN-10: 0-201-70271-1
  • ISBN-13: 978-0-201-70271-2

"Your organization needs this book!"
--Peter Salus, Chief Knowledge Officer, Matrix.Net, "The Bookworm"

This book describes the best practices of system and network administration, independent of specific platforms or technologies. It features six key principles of site design and support practices: simplicity, clarity, generality, automation, communication, and basics first. It examines the major areas of responsibility for system administrators within the context of these principles. The book also discusses change management and revision control, server upgrades, maintenance windows, and service conversions. You will find experience-based advice on topics such as:

  • The key elements your networks/systems need that will make all other services run better
  • Building and running reliable, scalable services, including email, printing, and remote access
  • Creating security policies and enforcing them
  • Upgrading thousands of hosts without creating havoc
  • Planning for and performing flawless scheduled maintenance windows
  • Superior helpdesks, customer care, and avoiding the temporary fix trap
  • Building data centers that prevent problems
  • Designing networks for speed and reliability
  • Email scaling and security issues
  • Why building a backup system isn't about backups
  • Monitoring what you have and predicting what you will need
  • How to stay technical and how not to be pushed into management

And there's more! When was the last time you read a book that dealt with:

  • Real-world technical management issues, including morale, organization building, coaching, maintaining positive visibility, and communicating with nontechnical management
  • Personal skill techniques, including our secrets for getting more done each day, dealing with less technical people, ethical dilemmas, managing your boss, and loving your job
  • System administration salary negotiation tips--the first book that includes this topic!

Chapters are divided into The Basics and The Icing. The Basics are those key elements that, when done right, make every other aspect of the job easier, such as starting all new hosts with the same configuration and picking the right things to automate first. The Icing sections contain all those powerful things that can be done on top of the basics to wow customers and managers. Do the basics first. The icing is a vision for the future that usually only comes with decades of experience.




Extras

Related Article

Diary of a Network Administrator: Mean People Suck

Author's Site

Click below for Web Resources related to this title:
Author's Web Site

Sample Content

Downloadable Sample Chapter

Click below for Sample Chapter related to this title:
limoncellich.pdf

Table of Contents



Preface.


Acknowledgments.


About the Authors.


Introduction.


Do These Now!

Use a Trouble-Ticket System.

Manage Quick Requests Right.

Start Every New Host in a Known State.



Conclusion.

I. THE PRINCIPLES.

1. Desktops.

The Basics.

Loading the System Software and Applications Initially.

Updating the System Software and Applications.

Network Configuration.

Dynamic DNS with DHCP.

The Icing.

High Confidence in Completion.

Involve Customers in the Standardization Process.

A Variety of Standard Configurations.

Conclusion.

2. Servers.

The Basics.

Buy Server Hardware for Servers.

Vendors Known for Reliable Products.

Does Server Hardware Really Cost More?

Maintenance Contracts and Spare Parts.

Data Backups.

Servers Live in the Data Center.

Same, Different, or a Stripped-Down OS on Clients.

Remote Administration Access.

Mirrored Root Disks.

The Icing.

Server Appliances.

Redundant Power Supplies.

Full and n + 1 Redundancy.

Hot-swap Components.

Separate Networks for Administrative Functions.

Opposing View: Many Inexpensive Workstations.

Conclusion.

3. Services.

The Basics.

Customer Requirements.

Operational Requirements.

Open Architecture.

Simplicity.

Vendor Relations.

Machine Independence.

Environment.

Restricted Access.

Reliability.

Single or Multiple Servers.

Centralization and Standards.

Performance.

Monitoring.

Service Rollout.

The Icing.

Dedicated Machines.

Full Redundancy.

Conclusion.

4. Debugging.

The Basics.

Learn the Customer's Problem.

Find the Problem's Cause and Fix It.

Have the Right Tools.

The Icing.

Better Tools.

Formal Training on the Tools.

End-to-End Understanding of the System.

Conclusion.

5. Fixing Things Once.

The Basics.

Fix Things Once, Rather Than Over and Over.

Avoid the Temporary Fix Trap.

Learning from Carpenters.

The Icing.

Conclusion.

6. Namespaces.

The Basics.

Namespaces Need Policies.

Namespaces Need Change Procedures.

Namespace Management Should Be Centralized.

The Icing.

One Huge Database That Drives Everything.

Further Automation.

Customers Do Many of the Updates.

Next-Level Namespace Ubiquity.

Conclusion.

7. Security Policy.

The Basics.

Build Security Using a Solid Infrastructure.

Ask the Right Questions.

Document the Company's Security Policies.

Basics for the Technical Staff.

Management and Organizational Issues.

The Icing.

Make Security Pervasive.

Stay Up-to-Date: Contacts and Technologies.

Produce Metrics.

Organization Profiles.

Small Company.

Medium-Size Company.

Large Company.

E-commerce Site.

University.

Conclusion.

8. Disaster Recovery and Data Integrity.

The Basics.

What Is a Disaster?

Risk Analysis.

Legal Obligations.

Damage Limitation.

Preparation.

Data Integrity.

The Icing.

Redundant Site.

Security Disasters.

Media Relations.

Conclusion.

9. Ethics.

The Basics.

Informed Consent.

Professional Code of Conduct.

Network/Computer User Code of Conduct.

Privileged Access Code of Conduct.

Copyright Adherence.

Working with Law Enforcement.

The Icing.

Setting Expectations on Privacy and Monitoring.

Being Told to Do Something Illegal/Unethical.

Conclusion.

II. THE PROCESSES.

10. Change Management and Revision Control.

The Basics.

Technical Issues.

Communications Structure.

Scheduling.

Process and Documentation.

Quiet Times.

The Icing.

Automated Front-Ends.

Change Management Meetings.

Streamline the Process.

Conclusion.

11. Server Upgrades.

The Basics.

The Steps in Detail.

The Icing.

Add and Remove Services at the Same Time.

Fresh Installs.

Reusing the Tests.

System Changelog.

A Dress Rehearsal.

Install Old and New Versions on the Same Machine.

Minimal Changes From the Base.

Conclusion.

12. Maintenance Windows.

The Basics.

Scheduling.

Planning.

Flight Director.

Change Proposals.

The Master Plan.

Disabling Access.

Mechanics and Coordination.

Deadlines for Change Completion.

Comprehensive System Testing.

Postmaintenance Communication.

Re-enable Remote Access.

Visible Presence the Next Morning.

Postmortem.

The Icing.

Mentoring a New Flight Director.

Trending of Historical Data.

Providing Limited Availability.

High-Availability Sites.

The Similarities.

The Differences.

Conclusion.

13. Service Conversions.

The Basics.

Small Groups First, Then Expand Communication.

Minimize Intrusiveness.

Layers Versus Pillars.

Avoid Flash-Cuts.

Successful Flash-Cuts.

Back-Out Plan.

The Icing.

Instant Roll-Back.

Avoid Explicit Conversions.

Vendor Support.

Conclusion.

14. Centralization and Decentralization.

The Basics.

Guiding Principles.

Candidates for Centralization.

Candidates for Decentralization.

The Icing.

Consolidate Purchasing.

Outsourcing.

Conclusion.

III. THE PRACTICES.

15. Helpdesks.

The Basics.

Have a Helpdesk.

A Friendly Face.

Staff Sizing.

Defined Scope of Coverage.

Defined Processes for Staff.

An Escalation Process.

Helpdesk Software.

The Icing.

Statistical Improvements.

Out of Hours and 24 x 7 Coverage.

Better Advertising for the Helpdesk.

Different "Desks" for Service Provision Versus Problem Resolution.

Conclusion.

16. Customer Care.

The Basics.

Ticket Tracking Software.

Phase A: The Greeting.

Phase B: Problem Identification ("What's Wrong?").

Phase C: Planning and Execution ("Fix It").

Phase D: Verification ("Verify It").

Perils of Skipping a Step.

Team of One.

The Icing.

Training Based on the Model.

The Single Point of Contact.

Increasing Customer Familiarity.

Special Announcements for Major Outages.

Trend Analysis.

Customers That Know the Process.

Architectural Decisions That Match the Process.

Conclusion.

17. Data Centers.

The Basics.

Picking a Location.

Access.

Security.

Power and Air.

Fire Suppression.

Racks.

Wiring.

Labeling.

Communication.

Console Servers.

Workbench.

Tools and Supplies.

Parking Spaces.

The Icing.

Greater Redundancy.

More Space.

Ideal Data Centers.

Tom's Dream Data Center.

Christine's Dream Data Center.

Conclusion.

18. Networks.

The Basics.

The OSI Model.

Clean Architecture.

Network Topologies.

Intermediate Distribution Frame.

Main Distribution Frame.

Demarcation Points.

Documentation.

Simple Host Routing.

Use Network Devices.

Overlay Networks.

Number of Vendors.

Standards-Based Protocols.

Monitoring.

Single Administrative Domain.

The Icing.

Leading-Edge Versus Reliability.

Multiple Administrative Domains.

Conclusion.

19. Email Service.

The Basics.

Privacy Policy.

Namespaces.

Reliability.

Simplicity.

Generality.

Automation.

Basic Monitoring.

Redundancy.

Scaling.

Security Issues.

Communication.

The Icing.

Encryption.

Backup Policy.

Advanced Monitoring.

High-Volume List Processing.

Conclusion.

20. Print Service.

The Basics.

Select the Level of Centralization.

Print Architecture Policy.

Designing the System.

Documentation.

Monitoring.

Environmental Issues.

The Icing.

Automatic Fail-Over and Load Balancing.

Dedicated Clerical Support.

Shredding.

Dealing with Printer Abuse.

Conclusion.

21. Backup and Restore.

The Basics.

Three Reasons for Restores.

The Backup Schedule.

Time and Capacity Planning.

Consumables Planning.

The Restore Process.

Backup Automation.

Centralization.

Tape Inventory.

The Icing.

Firedrills.

Backup Media and Off-Site Storage.

High DB Availability.

Technology Changes.

Conclusion.

22. Remote Access Service.

The Basics.

Remote Access Requirements.

Define a Remote Access Policy.

Define Service Levels.

Centralization.

Outsourcing.

Authentication.

Perimeter Security.

The Icing.

Home Office.

Cost Analysis and Reduction.

New Technologies.

Conclusion.

23. Software Depot Service.

The Basics.

Understand the Justification.

Understand the Technical Expectations.

Set the Policy.

Selecting Depot Software.

Create the Process Manual.

A Unix Example.

A Windows Example.

The Icing.

Different Configurations for Different Hosts.

Local Replication.

Including Commercial Software in the Depot.

Handling Second-Class Citizens.

Conclusion.

24. Service Monitoring.

The Basics.

Historical Data.

Real-Time Monitoring.

The Icing.

Accessibility.

Pervasive Monitoring.

Device Discovery.

End-to-End Tests.

Application Response Time Monitoring.

Scaling.

Conclusion.

IV. MANAGEMENT.

25. Organizational Structures.

The Basics.

Sizing.

Cost Centers.

Management Chain.

Appropriate Skills.

Infrastructure Teams.

Customer Support.

Helpdesk.

Outsourcing.

The Icing.

Consultants and Contractors.

Sample Organizational Structures.

Small Company.

Medium Company.

Large Company.

E-commerce Site.

Universities and Non-Profit Organizations.

Conclusion.

26. Perception and Visibility.

The Basics.

A Good First Impression.

Attitude, Perception, and Customers.

Align Your Priorities with Customer Expectations.

Be the System Advocate.

The Icing.

The System Status Web Page.

Management Meetings.

Be Visible.

Town Meetings.

Newsletters.

Mail to All Customers.

Lunch.

Conclusion.

27. Being Happy.

The Basics.

Organizing for Excellent Follow-Through.

Time Management.

Communication Skills.

Constant Professional Development.

Staying Technical.

The Icing.

Learn to Negotiate.

Loving Your Job.

Managing Your Manager.

Further Reading.

Conclusion.

28. A Guide for Technical Managers.

The Basics.

Responsibilities.

Working with Nontechnical Managers.

Working with Your Employees.

Decisions.

The Icing.

Make Your Team Even Stronger.

Sell Your Department to Senior Management.

Work on Your Own Career Growth.

Do Something You Enjoy.

Conclusion.

29. A Guide for Nontechnical Managers.

The Basics.

Morale.

Communication.

Staff Meetings.

Look for One-Year Plans.

Technical Staff and the Budget Process.

Professional Development.

The Icing.

Have a Five-Year Vision.

Meetings with Single Point of Contact.

Understand the Technical Staff's Work.

Conclusion.

30. Hiring System Administrators.

The Basics.

Job Description.

Skill Level.

Recruiting.

Timing Is Everything.

Team Considerations.

Select the Interview Team.

Interview Process.

Technical Interviewing.

Nontechnical Interviewing.

Sell the Position.

Employee Retention.

The Icing.

Get Noticed.

Conclusion.

31. Firing System Administrators.

The Basics.

Follow Your Corporate HR Policy.

Remove Physical Access.

Remove Remote Access.

Remove Service Access.

Fewer Access Databases.

The Icing.

A Single Authentication Database.

Monitoring System File Changes.

Conclusion.

Epilogue.
Appendix A. The Many Roles of a System Administrator.
Appendix B. What to Do When . . .
Appendix C. Acronyms.
Bibliography.
Index.

Preface

The goal of this book is to write down all the things that we've learned from our mentors and our real-world experiences. These are the things that are beyond what the manuals and the usual system administration books teach. System administrators (SAs) often find themselves swamped with work, struggling to keep the site running, and faced with requests for new technologies from their customers. Servers are overloaded or unreliable, but fixing the problem requires weeks of planning and painstakingly untangling a mess of services so that they can be moved to new machines. Hidden dependencies are lurking around every corner, and getting bitten by one can be catastrophic. In the meantime, repetitive day-to-day tasks still need to be done. The challenges seem insurmountable.

Most sites grow organically, with little thought given to the big picture as each little change is implemented. Haphazardly, SAs learn about the fundamentals of good site design and support practices. They are taught by mentors, if at all, about the importance of simplicity, clarity, generality, automation, communication, and doing the basics first. These six principles are recurring themes in this book.

  • Simplicity means that the smallest solution that solves the entire problem is the best solution. It keeps the systems easy to understand and reduces complex interactions between components that can cause debugging nightmares.
  • Clarity means that the solution is not convoluted. It can be easily explained to someone on the project or even outside the project. Clarity makes it easier to change the system, as well as to maintain and debug it.
  • Generality means that the solution solves many problems at once. Sometimes the most general solution is the simplest. It also means using vendor-independent open standard protocols that make systems more flexible and make it easier to link software packages together for better services.
  • Automation is critical. Manual processes cannot be repeated accurately, nor do they scale as well as automated processes. Automation is key to easing the system administration burden, and it eliminates tedious repetitive tasks and gives SAs more time to improve services.
  • Communication between the right people can solve more problems than hardware or software. You need to communicate well with other SAs and with your customers. It is your responsibility to initiate communication. Communication ensures that everyone is working toward the same goals. Lack of communication leaves people concerned and annoyed. Communication also includes documentation: document customers' needs to make sure you agree on them, document design decisions you make, document maintenance procedures. Documentation makes systems easier to maintain and upgrade. Good communication and proper documentation also make it easier to hand off projects and maintenance when you leave or take on a new role.
  • Doing the basics first means that you build the site on strong foundations by identifying and solving the basic problems before trying to attack more advanced ones. Doing the basics first makes adding advanced features considerably easier, and it makes services more robust. A good basic infrastructure can be repeatedly leveraged to improve the site with relatively little effort. Sometimes we see SAs at other sites making a huge effort to solve a problem that wouldn't exist, or would be a simple enhancement, if the site had a basic infrastructure in place. This book will help you identify what the basics are and show you how the other five principles apply. Each chapter looks at the basics of a given area. Get the fundamentals right, and everything else will fall into place.

These principles are universal. They apply at all levels of the system. They apply to physical networks and to computer hardware. They apply to all operating systems running at the site, all protocols used, all software, and all services provided. They apply at universities, non-profit institutions, government sites, businesses, and Internet service sites.

What Is an SA?

It's difficult to define what a system administrator is. Every company calls SAs something different. Sometimes they are called network administrators, system architects, or operators. Maybe the name isn't important--a rose by any other name . . .

Explaining What System Administration Entails
It's difficult to define system administration, but trying to explain it to a nontechnical person is even more difficult, especially if that person is your mom. Moms have the right to know how their offspring are paying their rent. A friend of Christine's always had trouble explaining to his mother what he did for a living and ended up giving a different answer every time she asked. Therefore she kept repeating the question every couple of months, waiting for an answer that would be meaningful to her. Then he started working for WebTV. When the product became available, he bought one for his mom. From then on, he told her that he made sure that her WebTV service was working and was as fast as possible. She was very happy that she could now show her friends something and say, "That's what my son does!"

System administrators do many things. They look after computers, networks, and the people who use them. An SA may look after hardware, operating systems, software, configurations, applications, or security. A system administrator is someone who influences how effectively other people can use their computers and networks.

System Administration Matters

System administration matters because computers and networks matter. Computers are a lot more important than they were years ago. What happened?

First of all, the technology has changed. Corporate computers used to be independent; now they are connected. Business processes used to have a component that involved using a computer; now entire processes are done online and come to a halt if any part of the system is broken.

The widespread use of the Internet and intranets and the move to a dot-com world have redefined the way companies depend on computers. The Internet is a 24 x 7 operation, and sloppy operations can no longer be tolerated. A paper purchase order can be processed any time, anywhere; therefore there is an expectation that the computer system that automates the process will be available all the time, from anywhere. Nightly maintenance windows have become an unheard-of luxury. That unreliable power system in the machine room that caused occasional but bearable problems now prevents sales from being recorded.

The biggest change, however, is due to CEOs putting a new importance on computing. In business, nothing is important unless the CEO feels it is important. The CEO controls funding and sets priorities. Now CEOs have become dependent on email. They notice when an outage or an overloaded system slows down their email. The massive preparations for Y2K also brought home to CEOs how dependent their organizations have become on computers.

We use the term chief executive officer (CEO) loosely to mean the top person in an organization. Educational institutions have CEOs; they're just referred to as president, provost, proctor, or head. Governments have CEOs; they're just referred to as mayor, governor, Prime Minister, leader, or President.

Management now has a more realistic view of computers. Previously people had unrealistic ideas of what computers could do, seeing them as portrayed in film: big, all-knowing, self-sufficient, miracle machines. This has changed. Even the need for SAs is now portrayed in films. In 1993, Jurassic Park (Crichton 1993) was the first mainstream movie to portray computers as needing system administration, leading to a better public understanding of what it is.

Computers matter more than ever. If computers are to work and work well, then system administration matters. We matter.

About the Book

This book was born from our experiences as SAs in a variety of companies. We have helped sites to grow. We have worked at small start-ups and universities, where lack of funding was an issue. We have worked at mid-size and large multinationals, where mergers and spin-offs give rise to more challenges. We've worked at fast-paced companies that do business on the Internet and have high-availability, high-performance, and rapid scaling issues. On the surface, these are very different environments with diverse challenges. But underneath, they all need the same building blocks, and the same fundamental principles apply.

This book gives you a framework--a way of thinking about system administration problems--rather than a narrow how-to solution to a particular problem. Given a solid framework, you can solve problems every time they appear, no matter what operating system (OS), brand of computer, or type of environment. This book is unique because it looks at system administration from this point of view, whereas most books for SAs focus on how to maintain one particular type of OS. With experience, however, all SAs learn that the big-picture problems and solutions are largely independent of the platform. This book will change the way you approach your work as an SA and the way you view the site you maintain.

The principles in this book apply to all environments. The approaches described may need to be scaled up or down, depending on your environment, but the basic principles still apply. In chapters where we felt that how to apply the information to other environments might not be obvious, we have included a section that illustrates how to apply the principles at different companies.

This book is not about how to configure or debug a particular OS. It will not tell you how to recover the shared libraries or DLLs when someone accidentally moves them. There are some excellent books that do cover those topics, and we will refer you to many of them throughout the book. What we will discuss here are the principles of good system administration, both basic and advanced, that we have learned through our own and others' experiences. These principles apply to all OSs. Following them well can make your life a lot easier. If you improve the way you approach problems, the benefit will be multiplied. Get the fundamentals right, and everything else falls into place. If they aren't done well, you will waste time repeatedly fixing the same things, and your customers[2] will be unhappy because they can't work effectively with broken machines.

[2] Throughout the book we refer to the end users of our systems as customers rather than users. A detailed explanation of why we do this is in Section 26.1.2.

We believe that SAs of all levels will benefit from reading this book. It gives junior SAs insight into the bigger picture of how sites work, their roles in the organizations, and how their careers can progress. Intermediate SAs will learn how to approach more complex problems and how to improve the sites, making their jobs easier and more interesting and their customers happier. It will help you to understand what is behind your day-to-day work, to learn the things that you can do now to save time in the future, to decide policy, to be architects and designers, to plan far into the future, to negotiate with vendors, and to interface with management. These are the things that concern senior SAs. None of them are listed in an OS's manual. Even senior SAs and systems architects can learn from our experiences and the experiences of our colleagues that are captured in these pages, as we have learned from each other in writing this book. We also cover several management topics, both for SA managers and for SAs who aspire to move into management.

The easiest way to learn usually is by example, particularly in the case of practical areas like system administration. Throughout the book, we use examples to illustrate the points we are making. The examples are mostly from medium or large sites, where scale adds its own problems. Typically, the examples are generic rather than specific to a particular OS, although some are OS-specific, usually Unix or Windows. One of the strongest motivations we had for writing this book is the understanding that the problems SAs face are the same across all OSs. A new OS that is significantly different from what we are used to can seem like a black box, a nuisance, or even a threat. However, despite the unfamiliar interface, as we get used to the new technology, eventually we realize that we face the same set of problems in deploying, scaling, and maintaining the new OS. Recognizing that fact, knowing what problems need solving, and understanding how to approach the solutions by building on experience with other OSs let us master the new challenges more easily.

We want this book to be something that changes your career. We want you to become so successful that if you see us on the street you'll give us a great big hug.

Organization

This book has four major parts:

  • Part I, The Principles, discusses the most basic issues SAs deal with, but we view them from the perspective of the frameworks that will lead you to doing them well.
  • Part II, The Processes, deals with change and the frameworks for making changes in ways that ensure success.
  • Part III, The Practices, collects our thoughts on what makes a great system, a great email service, a great print service, a great helpdesk, and so on.
  • Part IV, Management, comes next. Don't be afraid--it won't bite you. Actually, it will bite you, and we want you to be prepared. This part should help you understand your organization, your customers, yourself, and your managers. It ends with an exciting chapter on how to fire other SAs--a very delicate situation indeed.

The book ends with several appendices.

  • Appendix A discusses the roles that you and others play. It's a catalog of the various people we've met or worked with and the value they bring to an organization.
  • Appendix B connects the dots. It covers many situations you may experience and points you to the various places in the book that should be helpful. Please don't look at it now because you may find it so interesting that you won't return to finish reading this preface.
  • Appendix C contains a list of acronyms used in the text.

Each chapter discusses a different topic, and the topics vary from the technical to the nontechnical. If one chapter doesn't apply to you, feel free to skip it. The chapters are linked to each other, so you may find yourself returning to a chapter that you previously thought was boring. We won't be offended.

There are two halves to each chapter: The Basics and The Icing. The Basics discusses the essentials that you just plain have to get right. Skipping any of these items will simply create more work for you in the future. Consider them investments that pay off in efficiency later on. The Icing deals with the cool things that you can do to be spectacular. Don't spend your time with these things until you are done with The Basics. We have made an attempt to drive the points home through anecdotes and case studies from personal experience. We hope that this makes the advice here more real for you. Never trust salespeople who don't use their own products.

What's Next?

Each chapter stands on its own. Feel free to jump around. However, we have carefully ordered the chapters so that they make the most sense if you read the book from start to finish. Either way, we hope you enjoy the book. We have learned a lot and had a lot of fun writing it. Let's begin.

Thomas A. Limoncelli
Lumeta Corporation
tom@limoncelli.org

Christine Hogan
Independent Consultant
chogan@chogan.com

P.S. Books, like software, always have bugs. We intend to maintain a list of updates to this book on its web site: http://www.awl.com/cseng/titles/0-201-70271-1 or our web site, http://www.EverythingSysAdmin.com. Please visit!




Index

A


Above the fold, 145
Acceptable use policy. See AUP
Accepting
   blame for failures, 613
   criticism, 597
Accidentally deleting files, 97, 443–445
Accounting policy, 428–429
Accounts
   deleting, 106
   shared, 133–134, 136–137
ACL (Access Control List), 499
ActiveDirectory, 108
Active listening, 583–586
Active monitoring systems and security, 514–515
Ad hoc solution finders, 695
Administration, separate networks for, 44
Administrator who cried wolf, 705
AICPA (American Institute of Certified Public Accountants), 364
Alerts
   acknowledging, 514
   email, 511
   error messages, 512–513
   escalation policy, 513
   ignoring, 98
   policy for handling, 512
   proprietary information, 512
   wireless communication, 511–512
Aliases, 103–104, 107–108
Allman, Eric, 407
Analysis of UNIX System Configuration, 4–5
Angry people, 585–586
Apache server, 226
Appearance, 549
Applications
   completely tied together, 67
   initially loading, 8–14
   response time monitoring, 518
   updating, 14–17
Archival backups, 446, 448
Archival restores, 443, 445–446
Archives, storing off-site, 446
ARPAnet, 118
ATM (Asynchronous Transfer Mode), 396
ATS (Automatic Transfer Switch), 331
AT&T, 255, 258, 517
Attacks, 132–133
   DoS (denial-of-service), 150
   in-depth, 149
   social engineering, 149–150
Auditors, 144
Audits, external, 149–150
AUP (acceptable use policy), 124, 418, 437
AUSCERT (Australian Computer Emergency Response Team) web site, 132
Authentication, 133–136
   biometric mechanism, 134
   component centralization, 478
   full redundancy, 74
   mechanism problems, 134–135
   remote access, 481
   token-based system, 134
Authorization, 133–136
Authorization matrix, 135
AutoLoad, 4, 8, 11
Automated checks, 198–200
Automated front-ends for change management, 205
Automated installation system, 9–11
Automated patches, 17
Automated processes, 24–25
Automated reports and technical managers, 612
Automated updates, 17
Automatic fail-over, 435–436
Automation for
   anything, xxxv
   backups, 459–461
   change management, 196
   email, 412–413
   emailing when finished, 9–10
   fixing root problem, 99
   fixing symptoms, 98–99
   fixing things permanently, 98–99
   host's Ethernet MAC address, 10
   ignoring alerts, 98
   mailing list administration, 412–413
   manual processes, 9
   mistakes, 9
   partial, 11
   processes, 558–559
   quick customer requests, 554
   removing manual steps, 10
   saving money with, 8–9
   self-sufficient customers, 115
   system advocate, 557–558
   system clerk, 557–558
   Windows NT installation, 3–4
AutoPatch for Solaris, 15–16
Availability monitoring, 511

B

Back-out plans, 218–219, 222–223, 226, 263
Backup policies, 106, 417, 419–420, 448–449, 458
Backup software, 449, 461
Backups, 441–442
   archival, 446, 448
   automation, 459–461
   centralization, 461–462
   changing tapes schedule, 577–578
   clients, 35
   consumables planning, 456–457
   corporate guidelines, 447–448
   databases, 467
   disk size, 468
   dynamic schedules, 454
   80/20 rule, 451
   fire drills, 463–464
   full, 442
   general, 448
   high database availability, 467–468
   increasing cost, 442
   incremental, 442
   interconnection speeds, 456
   Internet-based, 467
   jukeboxes, 462
   manual, 460–461
   media storage, 464–467
   mirrored root disks, 40
   network-based, 462
   off-site storage, 464–467
   requiring brainwork, 460–461
   retention guidelines, 447
   restore bottlenecks, 464
   schedules, 449–455
   and SLAs, 448–449
   slowing down services, 447
   speed, 455
   tape capacity, 468
   tape inventory, 462–463
   technology changes, 468–469
   time constraints, 455
   timeframe, 447
Bagley, John, 517
Balancing work and personal life, 599–600
Bandwidth, 54–55, 508
Bell Labs, 565
   demo schedule, 201
   HHA system, 125–126
   laptop net, 24
   research division, 255–256
   setting naming standards, 110
BGP (Border Gateway Protocol), 373, 401
Biometric locks, 330–331
Blame, accepting, 613
Bleeding edgers, 705
Bonuses, 611, 623
Boot scripts reliability, 199
BOOTP (Boot Protocol), 18
Border routing protocols, 401
Bounced email, 95
British Telecom, 262
Budgets
   administrators, 699–700
   five-year vision, 647–648
   technical managers, 620
   technical staff and, 643–645
Bugtraq web site, 132
Building solution from scratch, 631
Bulk licensing, 182
Bunkers, 328
Bureaucrats, 580–581
Burgess, Mark, 114
Business, meeting needs of, 130
Business applications support team, 153
Buy-versus-build decision, 630–633
Buying canned software, 630
Buzzword-compliant, 86

C

Cables
   color-coding, 349, 354
   data centers, 348–354
   labeling, 355, 393
   management in racks, 345–346
   optimizing, 352
   prelabeled, 355
   raised floor, 348
   separating power and data, 352–354
   test printouts, 389
   unique serial number, 392
Capacity monitoring, 511
Capacity planners, 699
Career
   goals, 601
   paths, 618–619
Careful planners, 698–699
Central hosts, 395
Centralization, 267–268
   authentication component, 478
   backups, 461–462
   balance, 269
   big, honkin' file servers, 273–274
   candidates, 271–274
   clearly defining problem, 268
   commodity and, 269
   consolidating enterprise, 272–273
   consolidating purchasing, 276–278
   consolidating services into fewer hosts, 272
   cost savings, 271
   customer's concerns, 270
   customization, 269
   distributed systems management, 271
   economies of scale, 271
   increased purchasing power, 273
   infrastructure decisions, 273–274
   issues similar to new service, 270
   judgment calls, 269–270
   management decisions or politics, 270–271
   motivations for change, 268
   outsourcing, 278–281
   PC purchasing process, 277–278
   printing, 274, 426–427
   remote access, 478
   single points of failure, 276
   special features, 269
   system administration, 272
   unrealistic promises, 269
Centralized funding model, 530–531
Centralized models, 536–537
CEO, 111, 118, 180, 399, 459, 535, 544, 584, 648, 706, 715
CERT/CC web site, 132
CFO (chief financial officer), 532
CGI (Common Gateway Interface), 26
Change control, 106
Change management, 195, 629–630
   automated checks, 198–200
   automated front-ends for, 205
   automation, 196
   categories, 196
   change proposal forms, 204
   communication, 196
   critical machines, 204
   daily meetings, 207–208
   documentation, 196, 204–205
   identities attached to changes, 197
   large-scale events succeeding, 209
   locking mechanism, 197
   major updates, 202
   meetings, 207–208
   no changes on Friday, 203
   preparation, 196
   process, 204–205
   quiet times, 205
   revision control, 195–196
   revision history, 197–198
   routine updates, 201–202
   scheduling, 196, 201–204
   sensitive updates, 202
   significant changes, 196
   streamlining process, 210
   structure of communication, 200–201
   technical issues, 196–197
Change procedures and namespaces, 112
Change proposal forms, 204
CHANGELOG file, 226
Chaos topologies, 379
Cheswick, Bill, 379
CIAC (Computer Incident Advisory Capability) web site, 132
CIFS (Common Internet File System), 444, 499–500, 631
CIO (Chief Information Officer), 109, 532, 715
Cisco, 209
Classifier, 305–306
"Clean Desk Policy", 156
Clients and backups, 35
Cloning hard disks, 11–12
Closed source products, 138
Coaching, 617–618
Co-location (co-lo) center, 539
Color-coding cables, 349, 354
CommVault, 444
Commercial software
   licenses, 502
   software depots, 502–503
Communication
   active listening, 583–586
   change management, 196
   change management structure, 200–201
   data centers, 356
   email, 417–418
   I statements, 582–583
   with management and customers, 622
   mirroring, 583–584
   with nontechnical managers, 640–641
   reflection, 585–586
   removing roadblocks, 608–609
   service conversions, 257
   skills, 581–586
   standardizing on certain phrases, 584
   summary statements, 584–585
   technical issues, 581
Company mergers, 715–716
Compartmentalization, 274
Competitive advantage, 632
Competitors and security, 154
Complex host routing, 394
Compliments, 594–595
Components, hot-swappable, 43–44
Comprehensive system testing, 245–246
Computer Security Incident Handling: Step-by-Step booklet, 148
Computer-related crime, 182–186
Computers
   business desktop, 32
   crashing, 716–717
   home line, 32
   large influx of, 720
   retiring, 6
   server line, 32
   verifying contents of, 215
Confidential information, 122
Configuration files
   automated checks, 198–200
   master copies, 114
Configurations, variety of standard, 25–26
Consistency policy and namespaces, 110–111
Console servers, 37, 356–358, 553
Console service maintenance windows, 242
Consolidating
   enterprise, 272–273
   purchasing, 276–278
   services into fewer hosts, 272
Constant professional development, 586–587
Consultants, 540–541
Containment, 23
Contractors, 540–541
Controlled model selection, 33–34
Convenience and security, 119
Conversions, 263
Converting physical network, 261
COO (chief operating officer), 532
Copy Exact, 97
Copyright adherence, 181–182
Corporate
   backups and guidelines, 447–448
   culture and naming policy, 104
   guidelines, 443
   machines, 160
Correct tools for debugging, 83–86
Cost centers, 529
Costs, decreasing, 723–724
Craft worker, 313–314
Crashing computers, 716–717
Cricket, 39
Critical DNS server upgrade, 227–228
Critical host, 34
Criticism, 597
Cross-functional security teams
   business applications support team, 153
   field offices, 153–154
   legal department, 151–152
   product development group, 153
   system administrators, 152–153
CTO (Chief Technology Officer), 531–532
Curtin, Matt, 157, 419
Custom solutions, 632
Customer care, 301–302
   greeting, 304–305
   perils of skipping step, 315–317
   planning and execution, 310–313
   problem identification, 305–310
   reporting problems, 305
   Seinfeld-esque names, 315–317
   single point of contact, 317–318
   ticket tracking software, 304
   trend analysis, 319–320
   verification, 313–315
Customer support, 536–537, 704
   marketing-driven, 307
   system administration team, 533
Customers
   advocates for, 700
   better educated, 320–321
   building confidence, 725
   as craft worker, 313
   data restore needs, 446
   dependency check, 215
   digging into problem, 80
   expectations and aligning priorities, 553–554
   generating most tickets, 319
   good first impression, 548–552
   increasing familiarity with, 318
   making more self-sufficient, 289
   meetings with single point of contact, 648–650
   not good at expressing themselves, 80
   obtaining necessary information from, 307–309
   problems, 80–81
   recommendations for hiring, 656–657
   relationship with support people, 536
   requirements, 622
   resentment toward, 552
   and SAs, 551–553
   unhappy, 719
   venting about, 552–553
   verification/closing, 314–315
Customer/SAs, 703–704
Customization and decentralization, 275
CVS (Concurrent Versions System), 106

D

Daemons, 67
Daily planning, 574–575
Daily tasks, 577
Damage limitation, 166–167
Data. See also Backups; Security
   backups, 35
   integrity, 169–170
   security, 118
Data cables, 352–354
Data centers, 62, 325
   access, 328–330
   ATS, 331
   biometric locks, 330
   bunkers, 328
   bypass for UPS, 332
   cables, 348–354
   Christine's dream, 368–370
   communication, 356
   communication loss, 327
   console servers, 356–358
   delivery dock, 329
   different sources of power, 362–363
   earthquake zone, 328
   extra capacity for, 336–338
   extra electrical capacity, 336–337
   fire suppression, 340–341
   generator for backup power, 331
   generators, 333
   greater redundancy, 362–363
   heat sensors, 335
   high-reliability, 363–364
   historical perspective, 326
   hot spots, 335
   humidity control, 331
   HVAC systems, 334
   ideal, 364–370
   keyboards, 357
   labeling, 354–356
   lightning protection, 328
   MDF, 389–390
   monitors, 357
   more space for, 364
   moving, 714–715
   natural disasters, 327
   overhead power, 338
   parking spaces, 361–362
   PDU, 339–340
   picking location, 327–328
   power and air, 331–340
   power distribution, 338–340
   power outages, 332
   power outlets, 338–339
   preparing for power loss, 327
   proximity badges, 330
   racks, 341–348
   raised floors, 339
   restricting access, 329–330
   security, 329–331
   telecommunications industry, 363–364
   Tom's dream, 365–368
   tools and supplies, 359–361
   UPS, 331–333
   wasted space, 344
   water sensors, 339
   wiring, 348–354
   workbench, 359
Data collection, historical, 399, 509
Database administrator role, 133
Databases
   backups, 467
   high availability, 467–468
Debugging, 79
   better tools, 86
   correct tools for, 83–86
   end-to-end understanding of system, 87–88
   finding problem's cause and fixing it, 81–82
   fixing things once, 92–94
   follow-the-path, 82
   formal training on tools, 87
   latency problems, 85–86
   learning customer's problem, 80–81
   leveraging what others have done, 96–98
   NFS mounting problems, 84
   optimization, 82
   permanent fixes, 94, 96
   process of elimination, 82, 85
   recent changes and, 82
   RPC-based protocols, 84
   short-cuts, 82
   simple host routing, 394
   simple tools, 84–86
   successive refinement, 82
   TCP-based protocols, 85
   temporary fix trap, 94–96
   training, 83
   UNIX systems, 83
   Windows NT, 83
Decentralization, 267–268
   balance, 269
   business motivation, 275
   candidates, 274–276
   clearly defining problem, 268
   customer's concerns, 270
   customization, 275
   democratizing control, 274
   diversity in systems, 276
   duplication of effort, 274
   fault tolerance, 274–275
   first impressions, 270
   issues similar to new service, 270
   judgment calls, 269–270
   many single points of failure, 276
   meeting customer's needs, 275–276
   motivations for change, 268
   political motivation, 275
   trading efficiency for something more valuable, 274
   unrealistic promises, 269
Decentralized funding model, 530–531
Decentralized models, 536–537
Decisions
   explaining apparent, contrary to direction, 616
   precompiling, 577–579
   procrastination, 578
   schedule for changing tapes, 577–578
   taking organizer along, 578
Dedicated network devices, 395–396
Dedicated network router, 41
Delegation, 617
Deleting
   accounts, 106
   files, 97
Demarcation points, 391
Dependency chains, 519–520
Depot for software, 492
Design team, 533
Desktop workstations
   email clients, 409
   large quantities, 4
   loading system software and applications initially, 8–14
   long life cycles, 4
   managing operating systems, 3–6
   network configuration, 17–21
   OSs, 7–24
   updating system software and applications, 14–17
   variety of standard configurations, 25–26
   wiring, 388–389
Developer's toolchain, 504
Devices
   general-purpose, 395–396
   monitoring, 516
   naming standards, 392–393
Devices Control Panel, 96
DHCP (Dynamic Host Configuration Protocol), 18–24
   benefits, 18
   dynamic DNS, 21–24
   dynamic leases, 20
   managing lease times, 23
   moving clients away from resources, 23–24
   public networks, 20–21
   templates rather than per-host configuration, 18–19
Diagnostic tools
   minimal tools, 84–85
   sophisticated tools, 84
   understanding, 83–84
Diameter, 108
Diff command, 217, 314
Difficult-to-type names, 105
Dilbert Check, The, 660
Disaster, 164
Disaster recovery plan, 163
   damage limitation, 166–167
   data integrity, 169–170
   legal obligations, 166
   media relations, 171–172
   preparation, 167–169
   redundant site, 170
   replacement hardware, 168
   restoring services, 168
   risk analysis, 164–165
   risk-taking, 164
   security disasters, 171
   security zones, 170–171
   site location, 168
Disaster worriers, 698
Disk failure, 443, 445
Disposable servers, 45
Distribution-server model, 489
DNS (Domain Name Service), 51
   appliances, 41
   authenticating updates, 22
Document retention policy, 420
Documentation, 11
   change management, 196, 204–205
   email, 418
   helpdesks staff processes, 292
   how to print, 432–433
   labeling, 392
   list of printers, 433
   monitoring, 515–516
   networks, 391–393
   printer labels, 433
   printing, 432–433
   reasons for, 717–718
   restores, 458–459
   security policies, 124–131
Domains
   multiple administrative, 401–402
   single administrative, 399–400
Doohan, James, 220
DoS (denial-of-service) attacks, 150
Downloading software, 181
Dress rehearsal, 225–226
Dukhovni, Viktor, 12
Dumpster diving, 184–185
Dynamic DNS, 21–24
Dynamic IP addresses, 20
Dynamic leases, 20

E

EAP (employee assistance program), 597
E-commerce sites
   duplicating customer's environment, 310
   security, 160
EDA (electronic design automation) company, 152
Educators, 696
EIA/TIA (Electronic Industry Association/Telecommunications Industry Association), 347
80/20 rule, 451
Einstein, Albert, 570
Eircom, 355–356
Email, 405, 536
   acceptable use policy, 418
   advanced monitoring, 420
   alerts, 511
   automation, 412–413
   backup policies, 417, 419–420
   basic level of monitoring, 413
   bounced, 95
   client protocols, 411
   communication, 417–418
   delivering in several possible places, 410
   designing for reliability, 74–75
   document retention policy, 420
   documentation, 418
   encryption, 418–419
   failure, 408
   filtering, 576
   firewalls, 417
   first.last address style, 407
   forwarding, 188, 412
   full redundancy, 74
   gateways, 410
   generality, 411–412
   high-volume list processing, 420–421
   implementing and managing email lists, 410
   large bursts of traffic, 415
   logs, 413
   mail delivery, 409
   mail transport, 409
   mailing list processing, 409
   message sizes, 416
   monitoring, 187
   MTA records, 409, 631
   MUA records, 409
   MX records, 75, 414
   namespaces, 406–408
   nonstandard protocols, 411–412
   number of messages per person, 414
   number of users, 414–415
   open protocols, 411
   periodic automatic checks of active accounts, 412
   postmaster address, 413
   privacy policy, 406, 417
   proprietary software, 58–59
   reading someone else's email, 188–189
   recovery plan in case of failure, 414
   redundancy, 413–414
   reliability, 408–409
   removing accounts, 412
   retention policy, 448
   risks associated with, 418
   scaling, 414–416
   security, 416–417
   simple, clear, well-documented architecture, 406
   simplicity, 409–410
   size of messages, 414
   SMTP-based protocol, 411
   spare mail spool space, 416
   taking risks with, 408
   touching only once, 576
   traffic levels, 414
   translation devices, 410
   viruses, 417
Email addresses, 111, 407
Email appliances, 41
Email clients, 409
   automatically checking for messages, 415
   commercial encryption packages, 419
   load-balancing switches, 414
   operation, 414
   redundancy, 414
   VRRP, 414
Email hosts, 414
Email servers, 413
Emergency use of shutdown sequence, 242
Employees, 623–628
   expectations on privacy, 187
   lack of information, 625
   mistakes and, 624
   reprimands, 624
   retention, 670–672
   treating with respect, 623–626
Employment, looking for new, 723
Encrypted tunnels, 396
Encryption and email, 418–419
End-to-end experts, 707
End-to-end testing, 517–518
End-to-end understanding of system, 87–88
Enforcing company policy, 613–615
Equipment labeling, 354–355
Error messages and alerts, 512–513
Escalation, 292–293
/etc/ethers file, 19
/etc/hosts file, 19
/etc/motd file, 106, 225
/etc/ntp.conf file, 15
/etc/passwd file, 106
/etc/sh
