Data Recovery for Ext3: A File System for Linux
Ext3 is a journaling file system that is coming into increasing use among users of the Linux operating system. It is the default file system for the Red Hat, Fedora and Debian Linux distributions.
The Ext3 filesystem is built on two simple concepts:
- It is a journaling file system
- It is as compatible as possible with the old Ext2 filesystem
Ext3 achieves both goals well. It is largely based on Ext2, so its data structures on disk are essentially identical to those of an Ext2 filesystem. In fact, if an Ext3 filesystem has been cleanly unmounted, it can be remounted as an Ext2 filesystem; conversely, creating a journal on an Ext2 filesystem and remounting it as an Ext3 filesystem is a simple, fast operation.
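As a rough illustration of the conversion described above, the sketch below only builds the two commands typically used for it (tune2fs -j to add a journal, then a mount with type ext3) and prints them without running anything. The device name and mount point are placeholders, not values from this article, and the commands themselves would need root privileges.

```python
import shlex


def ext2_to_ext3_commands(device, mount_point):
    """Return the commands that add a journal to an Ext2 filesystem
    and then mount it as Ext3. Nothing is executed here."""
    return [
        ["tune2fs", "-j", device],                     # add a journal
        ["mount", "-t", "ext3", device, mount_point],  # mount with journaling
    ]


if __name__ == "__main__":
    # /dev/sdb1 and /mnt/data are hypothetical placeholders.
    for cmd in ext2_to_ext3_commands("/dev/sdb1", "/mnt/data"):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the example safe to run on any machine.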
Journaling Filesystems
As disks became larger, one design choice of traditional Unix file systems became inappropriate. Updates to file system blocks might be kept in dynamic memory for a long period of time before being flushed to disk. An event such as a power failure or a system crash might thus leave the filesystem in an inconsistent state. To overcome this problem, each traditional Unix filesystem is checked before being mounted; if it has not been properly unmounted, a specific program runs an exhaustive, time-consuming check and fixes all of the filesystem's data structures on disk.
The time spent checking the consistency of a filesystem depends mainly on the number of files and directories to be examined, and therefore also on the disk size. With file systems reaching hundreds of gigabytes, a single consistency check may take hours. This downtime is unacceptable for any production environment or high-availability server.
The aim of a journaling filesystem is to avoid running time-consuming consistency checks on the whole filesystem by looking instead in a special disk area, called the journal, that contains the most recent disk write operations. Remounting a journaling filesystem after a system failure is a matter of a few seconds.
Recovering an Ext3 File System
There are many data recovery tools for other file systems, but few for Ext3, which is currently the default file system for most Linux distributions. The reason lies in the way Ext3 deletes files: critical information that records where the file content is located is cleared during the deletion process.
At a minimum, the OS must mark each of the blocks, the inode, and the directory entry as unallocated so that later files can use them. This minimal approach is what occurred several years ago with the Ext2 file system. In this case, the data recovery process was relatively simple because the inode still contained the block addresses for the file content and tools such as debugfs and e2undel could easily re-create the file. This worked as long as the blocks had not been allocated to a new file and the original content was not overwritten.
With Ext3, there is an additional step that makes recovery much more difficult. When the blocks are unallocated, the file size and block addresses in the inode are cleared; therefore we can no longer determine where the file content was located.
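The difference between the two deletion behaviours can be sketched with a toy model. The Inode class and the function names below are hypothetical and greatly simplified; they are not the real on-disk structures, only an illustration of why the block addresses matter for recovery.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Toy inode: just a size, a list of block addresses and a flag."""
    size: int = 0
    block_addrs: list = field(default_factory=list)
    allocated: bool = True


def delete_ext2_style(inode):
    # Minimal approach: only mark the inode as unallocated.
    inode.allocated = False


def delete_ext3_style(inode):
    # Additional step: size and block addresses are cleared as well.
    inode.allocated = False
    inode.size = 0
    inode.block_addrs = [0] * len(inode.block_addrs)


def recoverable_blocks(inode):
    """Block addresses a recovery tool could still read from the inode."""
    return [addr for addr in inode.block_addrs if addr != 0]


if __name__ == "__main__":
    a = Inode(size=8192, block_addrs=[120, 121])
    b = Inode(size=8192, block_addrs=[120, 121])
    delete_ext2_style(a)
    delete_ext3_style(b)
    print(recoverable_blocks(a))  # addresses survive the deletion
    print(recoverable_blocks(b))  # nothing left to follow
```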
Data Recovery Approaches
We can examine two approaches to file recovery. The first approach uses the application type of the deleted file and the second approach uses data in the journal.
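The first approach can be sketched as a simple signature search over raw disk data, often called file carving. The example below looks for JPEG start and end markers in a byte string. It assumes that the file content is stored contiguously and has not been overwritten, which will not always hold on a real disk; the function name and the size limit are illustrative choices, not part of any particular tool.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus next marker byte
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(raw, max_size=10 * 1024 * 1024):
    """Return byte strings that look like JPEG files inside raw data.

    Assumes unfragmented content; max_size bounds how far ahead
    the search for a footer may go from each header.
    """
    found = []
    pos = 0
    while True:
        start = raw.find(JPEG_HEADER, pos)
        if start == -1:
            break
        end = raw.find(JPEG_FOOTER, start + len(JPEG_HEADER), start + max_size)
        if end == -1:
            pos = start + 1  # no footer in range, try the next header
            continue
        end += len(JPEG_FOOTER)
        found.append(raw[start:end])
        pos = end
    return found


if __name__ == "__main__":
    # Synthetic "disk image": padding, a fake JPEG, more padding.
    image = b"\x00" * 16 + b"\xff\xd8\xff\xe0ABC\xff\xd9" + b"\x00" * 8
    for candidate in carve_jpegs(image):
        print(len(candidate), "bytes carved")
```

Real carving has to cope with fragmentation and with markers that appear inside file content, so results from a search like this are only candidates to be checked by hand.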
While recovering after a system failure, the e2fsck program distinguishes the following two cases:
- The system failure occurred before a commit to the journal. Either the copies of the blocks relating to the high-level change are missing from the journal or they are incomplete; in both cases, e2fsck ignores them.
- The system failure occurred after a commit to the journal. The copies of the blocks are valid and e2fsck writes them into the file system.
In the first case, the high-level change to the filesystem is lost, but the filesystem state is still consistent. In the second case, e2fsck applies the whole high-level change, thus fixing any inconsistency due to unfinished I/O data transfers into the filesystem.
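The two cases can be illustrated with a toy replay routine. This is a simplified model, not how e2fsck or the real journal format works: the disk is a dictionary of block numbers, and each journal transaction is a dictionary with a hypothetical "committed" flag and the block copies it logged.

```python
def replay_journal(disk, journal):
    """Apply committed transactions to disk and ignore the others.

    disk:    dict mapping block number -> block content
    journal: list of dicts with keys "committed" and "blocks"
    """
    for txn in journal:
        if not txn["committed"]:
            # Failure happened before the commit: the change is lost,
            # but the disk stays in its previous consistent state.
            continue
        # Failure happened after the commit: write the logged copies.
        for block_no, data in txn["blocks"].items():
            disk[block_no] = data
    return disk


if __name__ == "__main__":
    disk = {10: "old-A", 11: "old-B"}
    journal = [
        {"committed": True, "blocks": {10: "new-A"}},
        {"committed": False, "blocks": {11: "new-B"}},
    ]
    print(replay_journal(disk, journal))
```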
A journaling file system ensures consistency only at the system call level. For instance, a system failure that occurs while you are copying a large file by issuing several write() system calls will interrupt the copy operation, so the duplicated file will be shorter than the original one.
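A small simulation makes this concrete. The function below copies data with one write() per chunk and can be told to stop after a given number of writes, standing in for a crash between two system calls. The names and the in-memory buffers are illustrative only.

```python
import io


def copy_in_chunks(src, dst, chunk_size, fail_after=None):
    """Copy src to dst with one write() per chunk.

    If fail_after is given, stop after that many writes to
    simulate a crash in the middle of the copy.
    """
    writes = 0
    while True:
        if fail_after is not None and writes == fail_after:
            break  # simulated failure between two write() calls
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        writes += 1
    return writes


if __name__ == "__main__":
    original = io.BytesIO(b"x" * 10000)
    partial = io.BytesIO()
    copy_in_chunks(original, partial, chunk_size=4096, fail_after=2)
    print(len(partial.getvalue()), "of 10000 bytes copied")
```

Each completed write() is intact, but nothing in the file system knows that the copy as a whole was meant to be one operation.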
Advantages of ext3
Availability:
Ext3 does not require a file system check, even after an unclean system shutdown, except in certain rare hardware failure cases (e.g. hard drive failures). This is because data is written to disk in such a way that the file system is always consistent. The time to recover an ext3 file system after an unclean system shutdown does not depend on the size of the file system or the number of files; rather, it depends on the size of the "journal" used to maintain consistency. A journal of the default size takes about a second to recover, depending on the speed of the hardware.
Data Integrity:
Ext3 can provide stronger guarantees about data integrity in case of an unclean system shutdown, and it lets you choose the level of protection. One option keeps the file system itself consistent but allows damage to the data in files after an unclean shutdown; this can give a modest speed-up under some but not all circumstances. The alternative ensures that the data is consistent with the state of the file system, so there will be no garbage data in recently written files after a crash. The safe choice, keeping the data consistent with the state of the file system, is the default.
Speed:
In spite of writing some data more than once, ext3 often has higher throughput than ext2 because its journaling optimizes hard drive head motion. You can choose among three journaling modes to optimize speed, optionally trading off some data integrity.
- One mode, data=writeback, limits the data integrity guarantees, allowing old data to show up in files after a crash, for a potential increase in speed under some circumstances.
- The second mode, data=ordered (the default mode), guarantees that the data is consistent with the file system; recently-written files will never show up with garbage contents after a crash.
- The last mode, data=journal, requires a larger journal for reasonable speed in most cases and therefore takes longer to recover in case of unclean shutdown, but is sometimes faster for certain database operations.
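The mode is selected with the data= mount option. As a minimal sketch, the helper below formats an /etc/fstab entry for a chosen mode and rejects anything other than the three modes listed above. The function name, the device, the mount point and the trailing "0 2" fields are illustrative assumptions and would need adjusting for a real system.

```python
VALID_DATA_MODES = ("writeback", "ordered", "journal")


def ext3_fstab_line(device, mount_point, data_mode="ordered"):
    """Format an fstab entry that mounts device as ext3 in data_mode."""
    if data_mode not in VALID_DATA_MODES:
        raise ValueError(f"unknown journaling mode: {data_mode!r}")
    options = f"defaults,data={data_mode}"
    # Fields: device, mount point, type, options, dump, fsck pass.
    return f"{device} {mount_point} ext3 {options} 0 2"


if __name__ == "__main__":
    # /dev/sdb1 and /mnt/data are hypothetical placeholders.
    for mode in VALID_DATA_MODES:
        print(ext3_fstab_line("/dev/sdb1", "/mnt/data", mode))
```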
Easy Transition:
It is easy to switch from ext2 to ext3 and gain the benefits of a robust journaling file system, without reformatting. In order to experience the advantages of ext3 there is no need to do a long, tedious, and error-prone backup-reformat-restore operation.