Hello y'all!
So, I'm attempting to use haproxy to load balance an IPv6 listener
with an IPv6 backend. The interesting problem I'm running into is that
I can reliably crash the Linux kernel I'm using. Has anyone else run
into a similar issue? (Obviously, this feels like a kernel bug to me:
a user-space program ought not to be able to crash the kernel. Still,
I wonder whether something I'm doing is particularly wrong in this
case.)
The kernel is the Scientific Linux port of the latest RHEL 6.3 kernel:
2.6.32-279.1.1.el6.x86_64
The haproxy version I'm experimenting with is 1.5-dev11, built as an
RPM using the haproxy.spec file included with the source.
I've tried this on other 2.6.32 kernels, with similar results.
Here's the pertinent portion of the crash log:
BUG: unable to handle kernel paging request at ffffc90737275ab8
IP: [<ffffffffa03108f8>] inet6_csk_search_req+0x48/0x130 [ipv6]
PGD 23feb8067 PUD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 3
Modules linked in: ip6table_filter ip6_tables xt_comment iptable_filter ip_tables bonding 8021q garp stp llc ipv6 xfs exportfs microcode serio_raw sg i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci 3w_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 2.6.32-279.1.1.el6.x86_64 #1 Supermicro X8DTU/X8DTU
RIP: 0010:[<ffffffffa03108f8>] [<ffffffffa03108f8>] inet6_csk_search_req+0x48/0x130 [ipv6]
RSP: 0018:ffff88002f663a30 EFLAGS: 00010206
RAX: 00000000e4af6154 RBX: ffff88023774e838 RCX: 00000000ffffffff
RDX: 00000000e4af6156 RSI: 00000000d675f11e RDI: 00000000f7a705c2
RBP: ffff88002f663a70 R08: 0000000062ea86fc R09: 00000000ff7e638b
R10: ffff8802364cd050 R11: 0000000000000000 R12: ffff88023774e848
R13: 0000000000002891 R14: 0000000000000004 R15: ffffc90011ac5000
FS: 0000000000000000(0000) GS:ffff88002f660000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffffc90737275ab8 CR3: 0000000239027000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff88023afc6000, task ffff88023afc2aa0)
Stack:
ffff88002f663a40 ffff88002f663ab8 ffff88002f663a70 ffff880237750780
<d> ffff8802364cd040 0000000000000000 ffff88023774e858 ffff88023774e830
<d> ffff88002f663b10 ffffffffa0309efd ffffffffa010e060 ffff880239251760
Call Trace:
<IRQ>
[<ffffffffa0309efd>] tcp_v6_do_rcv+0x38d/0x5b0 [ipv6]
[<ffffffffa02f2080>] ? ip6_pol_route_input+0x0/0x20 [ipv6]
[<ffffffffa030bbe0>] tcp_v6_rcv+0x560/0x870 [ipv6]
[<ffffffff814665f9>] ? nf_iterate+0x69/0xb0
[<ffffffffa02e67fa>] ip6_input_finish+0x16a/0x410 [ipv6]
[<ffffffffa02e6af8>] ip6_input+0x58/0x60 [ipv6]
[<ffffffffa02e621f>] ip6_rcv_finish+0x3f/0x50 [ipv6]
[<ffffffffa02e65b8>] ipv6_rcv+0x388/0x460 [ipv6]
[<ffffffff8143a7cb>] __netif_receive_skb+0x49b/0x6f0
[<ffffffff8143ca48>] netif_receive_skb+0x58/0x60
[<ffffffff8143cb50>] napi_skb_finish+0x50/0x70
[<ffffffff8143f089>] napi_gro_receive+0x39/0x50
[<ffffffffa01223b4>] igb_poll+0x864/0xb00 [igb]
[<ffffffff81060456>] ? rebalance_domains+0x1a6/0x5a0
[<ffffffff81096112>] ? enqueue_hrtimer+0x82/0xd0
[<ffffffff8143f1a3>] net_rx_action+0x103/0x2f0
[<ffffffff81073ec1>] __do_softirq+0xc1/0x1e0
[<ffffffff810db810>] ? handle_IRQ_event+0x60/0x170
[<ffffffff8100c24c>] call_softirq+0x1c/0x30
[<ffffffff8100de85>] do_softirq+0x65/0xa0
[<ffffffff81073ca5>] irq_exit+0x85/0x90
[<ffffffff81505b05>] do_IRQ+0x75/0xf0
[<ffffffff8100ba53>] ret_from_intr+0x0/0x11
<EOI>
[<ffffffff812cd8de>] ? intel_idle+0xde/0x170
[<ffffffff812cd8c1>] ? intel_idle+0xc1/0x170
[<ffffffff81407637>] cpuidle_idle_call+0xa7/0x140
[<ffffffff81009e06>] cpu_idle+0xb6/0x110
[<ffffffff814f6cef>] start_secondary+0x22a/0x26d
Code: 08 03 00 00 48 89 cb 41 89 d5 48 89 df 4d 89 c4 41 0f b7 f5 45 89 ce 41 0f b7 4f 14 41 8b 57 10 e8 6e fa ff ff 89 c2 48 83 c2 02 <49> 8b 44 d7 08 48 85 c0 0f 84 86 00 00 00 4d 8d 7c d7 08 eb 09
RIP [<ffffffffa03108f8>] inet6_csk_search_req+0x48/0x130 [ipv6]
RSP <ffff88002f663a30>
CR2: ffffc90737275ab8
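As a sanity check on the log above, the faulting address can be recomputed from the register dump. This is only a sketch under an assumption: I read the faulting instruction bytes (`<49> 8b 44 d7 08`) as a 64-bit load from `r15 + rdx*8 + 8`, a decoding done by hand rather than with a disassembler. The helper name `effective_address` is made up for illustration.

```python
# Register values copied from the oops above.
R15 = 0xFFFFC90011AC5000  # assumed base register of the faulting load
RDX = 0x00000000E4AF6156  # assumed index register of the faulting load
CR2 = 0xFFFFC90737275AB8  # faulting address reported by the kernel

MASK64 = (1 << 64) - 1


def effective_address(base, index, scale, disp):
    """Compute base + index*scale + disp, wrapped to 64 bits."""
    return (base + index * scale + disp) & MASK64


addr = effective_address(R15, RDX, scale=8, disp=8)
print(hex(addr))
print("matches CR2:", addr == CR2)
```

If that decoding is right, the computed address equals CR2, which places the access roughly 28 GiB past the base pointer in R15. To me that looks more like an out-of-range table index than a corrupted base pointer, but that is a guess and not a diagnosis.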
And here's the config I'm using:
# Config file for cust44052_http_80_lbs6443
global
log /dev/haproxy-log news
maxconn 50000
user haproxy
group haproxy
daemon
pidfile /var/run/haproxy/haproxy.cust44052_http_80_lbs6443.pid
stats socket /var/lib/haproxy/stats.cust44052_http_80_lbs6443.sock
nosplice
defaults
log global
mode http
option httplog
option dontlognull
option dontlog-normal
retries 3
option redispatch
maxconn 50000
contimeout 5000
clitimeout 50000
srvtimeout 50000
option forwardfor
errorfile 400 /etc/haproxy/errors/400.http
errorfile 403 /etc/haproxy/errors/403.http
errorfile 408 /etc/haproxy/errors/408.http
errorfile 500 /etc/haproxy/errors/500.http
errorfile 502 /etc/haproxy/errors/502.http
errorfile 503 /etc/haproxy/errors/503.http
errorfile 504 /etc/haproxy/errors/504.http
stats enable
stats hide-version
stats uri /bbg_haproxy_stats
stats realm BBG\ Haproxy\ Statistics
stats auth cust44052:i5gCkukscTV7pdpVR
balance roundrobin
option httpclose # disable keep-alive
#source
frontend cust44052_http_80_lbs6443
bind 2607:f700:8001:1b:1234:5678:abcd:beef:80
bind 199.91.168.52:80
acl site_dead nbsrv(default) lt 1
monitor fail if site_dead
default_backend default
backend default
option httpchk GET / HTTP/1.1\r\nHost:\ localhost
server will.c44052 2607:f700:8000:12e:dead:beef:1:449:80 check inter 5000 rise 2 fall 5
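A note on reading the IPv6 addresses in the config above: they are written without brackets, so the port is whatever follows the last colon. The sketch below shows that split in Python. It is not haproxy's own parser, and `split_host_port` is a hypothetical helper named only for this example.

```python
import ipaddress


def split_host_port(spec):
    """Split 'address:port' on the LAST colon and validate both parts."""
    host, sep, port = spec.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError("expected address:port, got %r" % spec)
    # ip_address() raises ValueError if host is not a valid IPv4/IPv6 address.
    return ipaddress.ip_address(host), int(port)


for spec in (
    "2607:f700:8001:1b:1234:5678:abcd:beef:80",  # frontend bind (IPv6)
    "199.91.168.52:80",                          # frontend bind (IPv4)
    "2607:f700:8000:12e:dead:beef:1:449:80",     # backend server
):
    address, port = split_host_port(spec)
    print(address.version, address, port)
```

Under this reading, all three entries resolve to port 80, and both IPv6 entries are full eight-group addresses.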
I can reliably trigger the crash just by trying to make a TCP
connection to 2607:f700:8001:1b:1234:5678:abcd:beef on port 80. This
does not happen when connecting to the IPv4 bind address above (in
fact, over IPv4 I get the web response I would expect from the
backend).
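For anyone who wants to try reproducing this, the trigger is nothing more than a plain TCP connect. Here is a minimal sketch in Python; the helper name `try_connect` and the 5-second timeout are my own choices for illustration, and no HTTP request is sent at all.

```python
import socket


def try_connect(host, port, timeout=5.0):
    """Attempt one TCP connection; return True if it was established."""
    try:
        # create_connection resolves the host and handles IPv4 and IPv6 alike.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


# Example call against the IPv6 bind address from the config above
# (only run this against a machine you are prepared to see crash):
#   try_connect("2607:f700:8001:1b:1234:5678:abcd:beef", 80)
```

The same call with 199.91.168.52 exercises the IPv4 listener for comparison.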
Thanks,
Stephen
--
Stephen Balukoff
Blue Box Group, LLC
(800)613-4305 x807