Mandrake Linux Archives: cooker-amd64@linux-mandrake.com

This was on a dual Opteron 246 with 6GB ECC RAM (full specs are here: 
http://archives.mandrakelinux.com/cooker-amd64/2003-11/msg00174.php)

kernel: kernel-smp-2.4.22.27mdk-1-1mdk.amd64.rpm, directly from updates.

The machine was moderately loaded, with a load of about 4.  There were 3 batch 
processes running with about 1.5GB of memory usage each, plus I was editing a 
file in emacs.  I noticed the following message on the terminal:

Message from syslogd@opteron at Mon Feb  2 23:06:25 2004 ...
opteron kernel: Oops: 0000

Message from syslogd@opteron at Mon Feb  2 23:06:25 2004 ...
opteron kernel: RIP [kmem_cache_alloc_batch+86/320] RSP <00000100f9309e68>

Message from syslogd@opteron at Mon Feb  2 23:06:25 2004 ...
opteron kernel: RIP [<ffffffff80146f36>] RSP <00000100f9309e68>

Message from syslogd@opteron at Mon Feb  2 23:06:25 2004 ...
opteron kernel: CR2: 00000103c003a644

and about 10 seconds later the machine had a hard lock.  The machine had just 
been rebooted 5 or so hours before, after several weeks of uptime at similar 
loads.
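
The symbolic and raw RIP lines in the messages above describe the same instruction pointer, so they can be cross-checked against each other. The following is only a rough sketch in Python: it assumes that klogd prints the `+offset/size` pair in decimal (an assumption, not something the log itself states), and the helper names are made up for illustration.

```python
import re

# The two RIP lines exactly as they appear in the syslog messages.
SYMBOLIC = "RIP [kmem_cache_alloc_batch+86/320] RSP <00000100f9309e68>"
RAW = "RIP [<ffffffff80146f36>] RSP <00000100f9309e68>"


def parse_symbolic(line, base=10):
    """Return (symbol, offset, size) from a '[symbol+offset/size]' field.

    Assumption: offset and size are decimal; pass base=16 if they are hex.
    """
    match = re.search(r"\[(\w+)\+([0-9a-fA-F]+)/([0-9a-fA-F]+)\]", line)
    if match is None:
        raise ValueError("no symbolic address found")
    name, offset, size = match.groups()
    return name, int(offset, base), int(size, base)


def parse_raw(line):
    """Return the integer value of a '[<hexaddress>]' field."""
    match = re.search(r"\[<([0-9a-fA-F]+)>\]", line)
    if match is None:
        raise ValueError("no raw address found")
    return int(match.group(1), 16)


name, offset, size = parse_symbolic(SYMBOLIC)
rip = parse_raw(RAW)

# Subtracting the offset from RIP gives the inferred start of the function.
func_start = rip - offset

print(f"{name}: starts at {func_start:#x}, faulted {offset} bytes in (size {size})")
```

Under the decimal assumption, the inferred function start is 16-byte aligned and the offset lies inside the reported function size, which is at least consistent with the fault being inside kmem_cache_alloc_batch itself.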

There were a few strange things going on.  Firstly, the machine was obviously 
swapping hard, despite the fact that it should have had plenty of memory.  I 
have the output of top below, captured maybe 5 minutes before the crash.  Note 
that it has been sanitised, and was sorted by memory usage, so there are no 
big processes missing.  At that stage, most of the CPU was taken up by kswapd, 
or waiting for the disk to catch up with the swap.  The memory usage had been 
consistently climbing.

It appears that the oops is related to a hard out-of-memory situation.

There have been similar oopses before (I can dig for the details, if someone 
needs them), which did not seem to affect the stability of the system.

Does anybody have any idea what is going on here?  Or is this a new bug?  Are 
there any newer kernel packages which are likely to work better?

top - 22:57:32 up  5:25, 10 users,  load average: 3.14, 3.19, 2.85
Tasks:  78 total,   1 running,  77 sleeping,   0 stopped,   0 zombie
Cpu(s):   0.3% user,   4.9% system,   2.4% nice,  92.4% idle
Mem:   6020620k total,  6003480k used,    17140k free,     1928k buffers
Swap:  6401860k total,  3451240k used,  2950620k free,    16176k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4146 jeremy     9   0 1593m 1.0g 475m D  0.6 17.0   1:10.70 bigproc1
 4003 jeremy    18  10 1340m 338m 117m D  4.7  5.8   6:37.31 bigproc2
 4041 jeremy     9   0 1405m 322m  80m D  0.3  5.5  11:08.47 bigproc1
 4147 jeremy     9   0 10780  10m 4764 S  0.0  0.2   0:00.18 emacs
  851 root       9   0  4084 4084 2508 S  0.0  0.1   0:00.34 ntpd
 4148 jeremy    19   0  2536 2536 1964 R  5.1  0.0   0:09.83 top
 1425 jeremy     9   0  1712  940  648 S  0.0  0.0   0:00.03 bash
 1247 jeremy     5 -10  1216  820  608 S  0.0  0.0   0:06.22 xxx
 1551 jeremy     9   0  1500  600  504 S  0.0  0.0   0:00.06 bash
  923 root       9   0  1220  416  328 S  0.0  0.0   0:00.27 cupsd
 1550 jeremy     9   0   844  336  280 S  0.0  0.0   0:00.10 sshd
 1424 jeremy     9   0   836  324  208 S  0.0  0.0   0:00.00 sshd
 1017 lp         9   0   564  316  212 S  0.0  0.0   0:00.40 cups-polld
 1245 root       9   0   348  296  220 S  0.0  0.0   0:00.98 xxx
 1172 root       9   0   152  104   68 S  0.0  0.0   0:00.03 crond
 3233 jeremy     9   0  7036   64    0 S  0.0  0.0   0:02.54 emacs
  773 xfs        9   0  2956   60   32 S  0.0  0.0   0:00.16 xfs
    1 root       8   0   112   56   52 S  0.0  0.0   0:05.62 init
  641 root       9   0   156   56   32 S  0.0  0.0   0:00.12 syslogd
 1548 root       9   0   624   24    0 S  0.0  0.0   0:00.02 sshd
 3215 jeremy     9   0  1496   20    0 S  0.0  0.0   0:00.05 vi
 3138 root       9   0   628   12    0 S  0.0  0.0   0:00.00 sshd
 1508 jeremy     9   0   652    8    0 S  0.0  0.0   0:00.47 sshd
 1422 root       9   0   624    4    0 S  0.0  0.0   0:00.01 sshd
    2 root       9   0     0    0    0 S  0.0  0.0   0:00.05 keventd
    3 root      19  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd_CPU0
    4 root      19  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd_CPU1
    5 root      11   0     0    0    0 S  2.8  0.0   0:16.67 kswapd
    6 root       9   0     0    0    0 S  0.0  0.0   0:00.00 bdflush
    7 root       9   0     0    0    0 S  0.0  0.0   0:00.70 kupdated
    8 root       9   0     0    0    0 S  0.0  0.0   0:00.03 kinoded
    9 root      -1 -20     0    0    0 S  0.0  0.0   0:00.00 mdrecoveryd
   13 root       9   0     0    0    0 S  0.0  0.0   0:00.01 ahd_dv_0
   14 root       9   0     0    0    0 S  0.0  0.0   0:00.00 ahd_dv_1
   15 root       9   0     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_0
   16 root       9   0     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_1
   20 root       9   0     0    0    0 S  0.0  0.0   0:00.10 kreiserfsd
  114 root       9   0   728    0    0 S  0.0  0.0   0:00.35 devfsd
  199 root       9   0     0    0    0 S  0.0  0.0   0:00.00 khubd
  627 rpc        9   0    96    0    0 S  0.0  0.0   0:00.00 portmap
  649 root       9   0  1008    0    0 S  0.0  0.0   0:00.05 klogd
  694 root       9   0   104    0    0 S  0.0  0.0   0:00.03 rpc.statd
  799 root       9   0     0    0    0 S  0.0  0.0   0:00.55 rpciod
  800 root       9   0     0    0    0 S  0.0  0.0   0:00.00 lockd
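
As a rough cross-check of the figures in the top output above, the resident sizes of the largest processes can be compared with the "used" total. This is only a sketch in Python: the values are copied by hand from the listing, bare RES numbers are assumed to be in KiB, "1.0g" is a rounded figure, and RES counts shared pages once per process, so the process total is if anything an overestimate.

```python
# Figures copied from the top summary lines (all in KiB).
MEM_USED_KIB = 6003480
BUFFERS_KIB = 1928
CACHED_KIB = 16176

# RES column of the six largest rows in the listing, as printed by top.
RES_FIELDS = ["1.0g", "338m", "322m", "10m", "4084", "2536"]


def res_to_kib(field):
    """Convert a top RES field such as '338m' or '4084' to KiB."""
    multipliers = {"k": 1, "m": 1024, "g": 1024 * 1024}
    suffix = field[-1].lower()
    if suffix in multipliers:
        return float(field[:-1]) * multipliers[suffix]
    return float(field)  # assumption: bare numbers are already KiB


resident_kib = sum(res_to_kib(field) for field in RES_FIELDS)

# Memory reported as used that is neither buffers, page cache,
# nor resident in one of the listed processes.
unaccounted_kib = MEM_USED_KIB - BUFFERS_KIB - CACHED_KIB - resident_kib

print(f"resident in listed processes: {resident_kib / 1024 ** 2:.2f} GiB")
print(f"used but unaccounted for:     {unaccounted_kib / 1024 ** 2:.2f} GiB")
```

The remaining rows of the listing add only a few MiB of resident memory, so on these assumptions roughly 4 GiB of the used memory is not attributable to any listed process, to buffers, or to the page cache. That would be consistent with memory being held inside the kernel, but the top output alone cannot confirm it.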

Cheers,
Jeremy


