Sorting a Large Text File
Last updated Jun 6, 2008.
In the previous section, I showed how to sort a text file that fits entirely in memory. As useful as that is, you’ll often encounter a file that is larger than your available memory. This is especially true on a 32-bit system, where a process is limited to an absolute maximum of 4 gigabytes of virtual address space, and the memory available to user programs is almost certainly less: typically 2 gigabytes on 32-bit Windows, or 3 gigabytes if the system is configured for it. How, then, do you sort a 20 gigabyte file?



