A saturated link isn't necessarily a bad thing, nor is it always an attack; the key is knowing how to diagnose, locate, and handle it. Let's first look at why bandwidth fills up. The most common reason is actually a "good thing"—a sudden surge in business activity. Japanese servers are frequently used for game acceleration, video distribution, and cross-border e-commerce, and during promotional events, version updates, or the release of trending content, traffic spikes instantly. This kind of traffic has telltale characteristics: concentrated request paths, relatively fixed sources, and normal access behavior. Check the access logs and you'll see real page requests and download records.
Another possibility is "false bandwidth fullness"—caused by network quality issues. Japanese servers serve multiple regions, and if an international link fluctuates, TCP retransmissions will increase significantly. Monitoring will show a sudden spike in bandwidth, but much of it is duplicate data, not real traffic. This "false surge" is especially common during peak evening hours when cross-border links are congested.
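One quick way to separate a real traffic surge from retransmission noise is to watch the kernel's cumulative TCP counters. A minimal sketch, assuming a Linux host (on the "Tcp:" values line of /proc/net/snmp, field 12 is OutSegs and field 13 is RetransSegs); the helper name `tcp_counters` is mine, not a standard tool:

```shell
#!/bin/sh
# Sketch: estimate the TCP retransmission ratio over a short window.
# A ratio well above a few percent during a "bandwidth spike" points
# at link quality, not real demand.
tcp_counters() {  # usage: tcp_counters <snmp-format file> -> "OutSegs RetransSegs"
  awk '/^Tcp:/ && $12 ~ /^[0-9]+$/ { print $12, $13 }' "$1"
}

if [ -r /proc/net/snmp ]; then
  set -- $(tcp_counters /proc/net/snmp); out1=$1; ret1=$2
  sleep 5
  set -- $(tcp_counters /proc/net/snmp); out2=$1; ret2=$2
  dout=$((out2 - out1)); dret=$((ret2 - ret1))
  if [ "$dout" -gt 0 ]; then
    echo "retransmission ratio over window: $((100 * dret / dout))%"
  fi
fi
```

A ratio of 1–2% is unremarkable on cross-border links; double digits during a "surge" means much of that bandwidth is duplicate data.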
Of course, we can't rule out a genuine attack. UDP floods and reflection amplification attacks can indeed clog your exit points with invalid data packets in a short period. The characteristics of this situation are quite obvious: a large number of new connections fail, normal users can hardly connect, and the system load is completely disproportionate to the access volume.
So the question is: how to quickly determine which situation it is?
First, check the traffic source. Use iftop to check real-time bandwidth usage:
iftop -i eth0
This tool can directly show which IPs are currently communicating and how much bandwidth they are consuming. If the source IPs are scattered all over the world, and you can't recall any business targeting these regions, then be wary.
Second, check the connection status. Use ss to count the number of connections in various TCP states:
ss -ant | awk '{print $1}' | sort | uniq -c
If the number of connections in the SYN-RECV state is abnormally high (ss prints hyphenated state names; netstat's equivalent is SYN_RECV), it may be a SYN flood; if the TIME-WAIT count is outrageously high, that points to connection churn under heavy load, not necessarily an attack.
Third, capture packets and examine the content. If you're still unsure, use tcpdump to capture and analyze the traffic:
tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-syn) != 0' -c 1000
This command captures 1000 SYN packets so you can check whether the source IPs are spoofed or scanning ports. In normal business traffic, packet payloads are meaningful; attack traffic often consists of large numbers of empty packets or a fixed pattern.
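To make the capture easier to eyeball, you can tally the top source addresses. A small helper, assuming IPv4 and tcpdump's default text output (where the third field is the source address with the port appended as a fifth dotted component); the function name `top_syn_sources` is mine:

```shell
# Tally SYN sources from tcpdump text output (IPv4 assumed).
# tcpdump prints "time IP src.port > dst.port: Flags [S] ...", so
# strip the trailing port from field 3 and count occurrences.
top_syn_sources() {
  awk '$2 == "IP" { n = split($3, a, "."); if (n >= 5) print a[1]"."a[2]"."a[3]"."a[4] }' |
    sort | uniq -c | sort -rn | head
}

# Example: pipe the capture through it
# tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-syn) != 0' -c 1000 | top_syn_sources
```

A handful of IPs owning most of the count suggests a single abusive client or a small botnet; thousands of unique, never-repeating sources are typical of a spoofed flood.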
Once you've determined whether it's normal traffic or an attack, it's time to take action.
If it's a surge in normal business traffic, congratulations, this is a "happy problem." But while it's a happy problem, your bandwidth is definitely full, and user access is starting to lag. What do you do?
The short-term solution is to implement rate limiting and traffic shaping. Using the Linux `tc` command, allocate different bandwidth guarantees to different services:
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:1 htb rate 80mbit ceil 100mbit
tc class add dev eth0 parent 1: classid 1:10 htb rate 20mbit ceil 100mbit
tc filter add dev eth0 protocol ip parent 1: prio 1 u32 match ip dst 192.168.1.0/24 flowid 1:1
This configuration means: traffic to the matched IP range goes into class 1:1, which is guaranteed 80M and can burst to 100M, while everything else falls into the default class 1:10 (which must exist, hence the second class line). This way, even if the total bandwidth is fully utilized, core services will not be squeezed out.
The long-term solution is architectural. Put static resources on a CDN so that images, scripts, and styles are served from edge nodes and the origin only handles dynamic requests; pairing Japanese nodes with a service like Cloudflare can significantly reduce origin-pull bandwidth. If the budget allows, upgrade the bandwidth plan or switch to 95th-percentile billing, which is more cost-effective for businesses like e-commerce that run steady most of the time and spike during sales events.
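For reference, 95th-percentile ("burstable") billing samples bandwidth every 5 minutes, discards the top 5% of samples, and bills the highest remaining one. A toy calculation with made-up values (one Mbps reading per line in a hypothetical samples.txt):

```shell
# demo data: 100 hypothetical 5-minute samples, 1..100 Mbps
seq 1 100 > samples.txt

# sort ascending and take the value at the 95% position:
# the top 5% of samples are free, the next one is the billable rate
sort -n samples.txt | awk '{ v[NR] = $1 }
  END { idx = int(NR * 0.95); if (idx < 1) idx = 1; print v[idx] " Mbps" }'
# prints: 95 Mbps
```

Over a full month (~8,640 samples), roughly 432 of the busiest five-minute intervals are ignored, which is exactly why short sales-day spikes cost nothing under this model.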
If it's an attack, the approach is different.
The simplest solution is to temporarily block suspicious IPs:
iptables -A INPUT -s <suspicious-IP> -j DROP
If the attack sources are too dispersed to block individually, you can enable SYN Cookie protection:
sysctl -w net.ipv4.tcp_syncookies=1
This can mitigate SYN flood attacks. Note that sysctl -w changes are lost on reboot; add net.ipv4.tcp_syncookies = 1 to /etc/sysctl.conf to make it permanent.
Furthermore, use the `limit` module in iptables for rate limiting:
iptables -A INPUT -p tcp --dport 80 --syn -m limit --limit 25/minute --limit-burst 100 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 --syn -j DROP
This means accepting at most 25 new connections (SYN packets) per minute, with a burst of no more than 100; anything beyond that is dropped, while established connections are untouched. This may inadvertently affect some legitimate users, but it's better than the entire service going down.
If the attack traffic is too large to handle, you'll need to rely on upstream providers. Most cloud service providers offer DDoS mitigation services; in emergencies, you can contact them to enable blackholes or traffic redirection. You can also use edge protection like Cloudflare to block large amounts of traffic from reaching the origin server.
Finally, let's talk about prevention.
Instead of scrambling after bandwidth is already maxed out, put monitoring in place ahead of time. Use Prometheus with node_exporter to watch NIC traffic in real time and set sensible alert thresholds—for example, alert if outbound bandwidth stays above 80% for 5 consecutive minutes. That way you get notified during the day instead of being woken up in the middle of the night.
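Even without a full Prometheus stack, the same 80% check can be scripted. A rough sketch, assuming a Linux host, an eth0 interface, and a 100 Mbps plan (all three are placeholders to adjust); the helper name `tx_bytes` is mine:

```shell
#!/bin/sh
# Rough outbound-utilization check. /proc/net/dev holds cumulative
# counters; on the line starting with the interface name, field 10
# is total transmitted bytes.
IFACE=eth0 LINK_MBPS=100 WINDOW=5

tx_bytes() {  # usage: tx_bytes <iface> <proc-net-dev-format file>
  awk -v i="$1" '$1 == i":" { print $10 }' "$2"
}

t1=$(tx_bytes "$IFACE" /proc/net/dev 2>/dev/null)
if [ -n "$t1" ]; then
  sleep "$WINDOW"
  t2=$(tx_bytes "$IFACE" /proc/net/dev)
  mbps=$(( (t2 - t1) * 8 / WINDOW / 1000000 ))
  pct=$(( mbps * 100 / LINK_MBPS ))
  echo "outbound: ${mbps} Mbps (${pct}% of link)"
  if [ "$pct" -ge 80 ]; then
    echo "ALERT: above 80% threshold"
  fi
fi
```

Run from cron every few minutes, this is a crude stand-in for the Prometheus alert rule described above.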
It's also recommended to install a lightweight tool like vnStat. It silently stores historical data in the background, making it easy to understand traffic trends when needed:
vnstat -d # View daily traffic for the past 30 days
With historical baselines, you can determine whether "today's traffic is truly abnormal, or if it's a pattern every month on this day."
For lightweight cloud servers in Japan, line quality also matters. If your users are mainly in China, products with CN2 GIA or AS9929 optimized routes are noticeably more stable during peak evening hours. Ultimately, panicking over a full link is useless; what matters is knowing how to analyze it, respond to it, and prevent it. Figure out whether the surge is good news or bad, then take the right measures so your server can take the strain.