In Linux systems, the efficiency of file operations directly affects system performance, especially when processing massive numbers of files. With a solid understanding of file system mechanisms and kernel features, the speed of file creation and deletion can be increased by dozens of times, which in a production environment translates directly into working efficiency. File operations on Linux servers involve several layers: system call overhead, file system features, and hardware interaction. The sections below start from the kernel mechanisms, then move into command-level parameter tuning for masses of small files and for large files separately; deletion deserves extra attention because of how ext4's internals affect performance.
Optimize file creation performance
Pre-allocation reduces fragmentation:

```bash
fallocate -l 2G large_file.img          # completes in ~0.1 s
dd if=/dev/zero of=file bs=1G count=2   # takes ~3 s on a mechanical disk
```

- fallocate allocates disk blocks directly (a metadata-only operation; see the check below)
- dd performs a full physical write and is roughly 200 times slower
- Ext4/XFS support instant pre-allocation; NTFS requires a full write
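To confirm that fallocate only touched metadata and left the file unfragmented, the allocation can be inspected after the fact; a quick check (not part of the original workflow):

```bash
fallocate -l 2G large_file.img
stat -c 'size=%s bytes, allocated blocks=%b' large_file.img   # blocks are already reserved
filefrag large_file.img   # typically reports just a handful of extents
```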
Batch processing reduces system calls, as the following example shows:
```python
# Efficient batch creation in Python
with open('batch.txt', 'w') as f:
    for i in range(100000):
        f.write(f"Line {i}\n")  # a single open() covers all 100,000 lines
```
- A single open() call is about 300 times faster than reopening the file on every iteration (the shell sketch below shows the same effect)
- Buffer setting: enlarging the stdio buffer to 1 MB with setvbuf() improves the write speed for small records
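The open-once-vs-open-per-write gap is easy to reproduce from the shell; a rough timing sketch (the file names are arbitrary):

```bash
# one open()/write()/close() cycle per line: slow
time for i in $(seq 1 100000); do echo "Line $i" >> slow.txt; done

# one open() for the whole loop: the redirection applies to the group
time { for i in $(seq 1 100000); do echo "Line $i"; done; } > fast.txt
```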
Useful file-system-specific features include:
- XFS's delayed allocation mechanism: merges multiple write requests
- Btrfs's copy-on-write: avoids writing duplicate data
- The tmpfs in-memory file system: creation is about 100 times faster than on an SSD
```bash
mount -t tmpfs -o size=1G tmpfs /mnt/ramdisk
```
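A quick sanity check after mounting (the file count below is arbitrary):

```bash
df -h /mnt/ramdisk                    # confirm the 1G tmpfs is in place
time touch /mnt/ramdisk/f{1..10000}   # 10,000 files appear in a fraction of a second
```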
The following is a collection of commonly used commands for deep optimization of file deletion.
1. Asynchronous deletion mechanism
```bash
rsync -a --delete empty_dir/ target_dir/   # about 5 times faster than rm
```

Principle: sync the target against an empty directory, so rsync unlinks everything inside it (the full sequence is sketched below)
Applicable scenario: deleting millions of small files
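A complete version of the trick; the directory names are placeholders:

```bash
mkdir -p /tmp/empty_dir
rsync -a --delete /tmp/empty_dir/ target_dir/   # rsync removes everything under target_dir
rmdir target_dir /tmp/empty_dir                 # both directories are now empty
```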
2. Kernel parameter tuning
```bash
sysctl -w vm.vfs_cache_pressure=200   # reclaim dentry/inode caches more aggressively
echo 3 > /proc/sys/vm/drop_caches     # drop pagecache, dentries and inodes immediately
```
Enable hashed B-tree (HTree) directory indexing on ext4 via dir_index:

```bash
tune2fs -O dir_index /dev/sda1
```
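Note that dir_index only takes effect for directories created afterwards; existing directories keep their linear layout until rebuilt. With the filesystem unmounted:

```bash
e2fsck -fD /dev/sda1   # -D rehashes and optimizes all existing directories
```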
3. Physical storage feature adaptation
- SSD: enable periodic TRIM to prevent deletion performance from degrading

```bash
fstrim -v /mnt/data   # execute weekly
```

- SSD: disabling the journal reduces write amplification (at the cost of crash consistency)

```bash
mkfs.ext4 -O ^has_journal /dev/nvme0n1p1
```
High concurrency scenario practice
1. Parallel processing framework
```bash
# Use GNU parallel to delete millions of files
find /data/2023- -type f | parallel -j 32 rm {}
```

- Set -j according to the number of CPU cores (cores x 2 is the recommended starting point); a batched alternative is sketched below
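Spawning one rm per file pays exec overhead for every deletion; xargs can batch many paths into each rm invocation (the -n 500 batch size is an arbitrary choice):

```bash
find /data/2023- -type f -print0 | xargs -0 -P 32 -n 500 rm -f
# -print0/-0 survive spaces and newlines in names; -P 32 runs 32 rm processes in parallel
```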
2. Comparison of efficient tool chains
| Metric | rm | rsync | find -delete | unlink |
|---|---|---|---|---|
| Time for 100,000 files | 82 s | 18 s | 76 s | 79 s |
| Memory usage (MB) | 15 | 220 | 18 | 12 |
| System calls | 300,000+ | 50,000 | 280,000+ | 300,000+ |
3. Solutions for extreme scenarios
Preventing inode exhaustion:

```bash
df -i                    # monitor inode usage
tune2fs -i 0 /dev/sdb1   # disable time-based fsck intervals
```
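If a volume is dedicated to masses of small files, inode density can also be raised at format time; a sketch (the 4096 bytes-per-inode ratio is an assumption, tune it to the workload):

```bash
mkfs.ext4 -i 4096 /dev/sdb1   # one inode per 4 KB instead of the default ~16 KB
```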
Zombie file processing:

```bash
lsof +L1        # find deleted files still held open by processes
kill -9 <PID>   # killing the holding process frees the space
```
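When the holding process must keep running, the space can usually be reclaimed by truncating the deleted file through /proc instead; PID and FD come from the lsof output (the values below are placeholders):

```bash
PID=1234; FD=4          # placeholders taken from `lsof +L1`
: > /proc/$PID/fd/$FD   # truncate the deleted-but-open file to zero bytes
```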
Safety and reliability
1. Complete deletion of sensitive data
- Mechanical hard disk: 7 overwrite passes (DoD 5220.22-M standard)

```bash
shred -n 7 -z confidential.doc   # -z adds a final zero pass to hide the shredding
```

- SSD: use the manufacturer's secure erase tool

```bash
nvme format -s1 /dev/nvme0n1
```
2. Anti-accidental deletion technology
```bash
alias rm='rm -I'          # prompt once before deleting more than three files or recursing
chattr +i critical_file   # make the file immutable (undo with chattr -i)
```
3. Automatic Recycle Bin
```bash
# Custom deletion function: move into a dated trash folder instead of unlinking
del() {
    local trash="$HOME/.Trash/$(date +%Y%m%d)"
    mkdir -p "$trash"      # the day's trash directory may not exist yet
    mv -- "$@" "$trash"/
}
```
Performance benchmark test data
Creation speed comparison (10,000 1KB files)
- Ext4 default: 12.8 seconds
- XFS+fallocate: 0.3 seconds
- Tmpfs: 0.1 seconds
Deletion speed comparison (1 million empty files)
- rm -rf: 182 seconds
- rsync: 31 seconds
- parallel rm: 28 seconds
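A minimal harness to reproduce the deletion numbers on your own hardware (1 million files; xargs keeps the argument list below ARG_MAX):

```bash
mkdir bench && cd bench
seq 1 1000000 | xargs touch   # create 1 million empty files in batches
cd .. && time rm -rf bench    # compare against the figures above
```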
Production environment recommendations:
- Avoid tmpfs on database servers; give priority to the XFS file system.
- On log servers, use the rsync deletion approach and automate rotation with logrotate.
- Before deleting critical data, verify the backup and follow the 3-2-1 principle (3 copies, 2 media types, 1 copy off-site).
- When processing more than 100 million files, consider a distributed file system such as Ceph instead of a single-machine solution.
- Monitor IO load with iostat -xmt 2 to ensure deletion operations don't affect the response time of core services.