Here is my free and smem output:
danslimmon@bad-server:~$ free -m
total used free shared buffers cached
Mem: 30147 29928 218 6 4 3086
-/+ buffers/cache: 26837 3309
Swap: 0 0 0
danslimmon@bad-server:~$ smem -tw
Area Used Cache Noncache
firmware/hardware 0 0 0
kernel image 0 0 0
kernel dynamic memory 12857576 2887440 9970136
userspace memory 17661400 1272468 16388932
free memory 351592 351592 0
----------------------------------------------------------
30870568 4511500 26359068
And here is the head of my top output, sorted by RSS:
top - 15:51:13 up 248 days, 14:20, 1 user, load average: 14.43, 11.00, 8.95
Tasks: 510 total, 2 running, 508 sleeping, 0 stopped, 0 zombie
%Cpu(s): 30.6 us, 3.8 sy, 0.9 ni, 63.8 id, 0.1 wa, 0.0 hi, 0.5 si, 0.2 st
KiB Mem: 30870568 total, 30469188 used, 401380 free, 4364 buffers
KiB Swap: 0 total, 0 used, 0 free. 2994052 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
41801 cassand+ 20 0 17.501g 7.845g 757184 S 428.9 26.6 181039:29 java
73105 redacte+ 20 0 8401132 6.181g 3684 S 251.1 21.0 11336:12 beam.smp
105293 nobody 20 0 3115584 2.060g 2284 S 97.2 7.0 28:38.51 statsd
1743 opscent+ 20 0 3347672 172816 1460 S 0.0 0.6 383:20.64 java
73294 redacte+ 30 10 70056 58952 988 S 0.3 0.2 13:39.12 consumer:00237
73279 redacte+ 30 10 68052 56916 1008 S 1.0 0.2 47:04.69 consumer:00226
73281 redacte+ 30 10 67552 56464 1012 S 1.7 0.2 61:14.90 consumer:00230
73304 redacte+ 30 10 65512 54404 984 S 0.7 0.2 37:46.67 consumer:00210
73305 redacte+ 30 10 64640 53576 988 S 1.7 0.2 73:32.57 consumer:00228
73278 redacte+ 30 10 64540 53504 1024 S 1.3 0.2 32:16.44 consumer:00212
73308 redacte+ 30 10 64452 53392 1056 S 0.7 0.2 34:27.21 consumer:00220
73287 redacte+ 30 10 64128 53016 1004 S 1.3 0.2 70:54.29 consumer:00218
73300 redacte+ 30 10 64024 52828 984 S 0.7 0.2 33:05.53 consumer:00207
73299 redacte+ 30 10 63744 52680 984 S 1.3 0.2 40:28.91 consumer:00209
73302 redacte+ 30 10 62840 51812 1028 S 1.3 0.2 45:07.17 consumer:00210
73288 redacte+ 30 10 62268 51240 1068 S 1.0 0.2 46:46.53 consumer:00209
73297 redacte+ 30 10 62988 50924 976 S 0.7 0.2 34:03.82 consumer:00203
73296 redacte+ 30 10 62024 50912 984 S 0.7 0.2 41:08.47 consumer:00205
73280 redacte+ 30 10 61748 50588 956 S 0.7 0.2 35:50.30 consumer:00203
73303 redacte+ 30 10 60632 49564 976 S 1.3 0.2 56:31.81 consumer:00182
73290 redacte+ 30 10 60512 49440 980 S 0.7 0.2 41:46.82 consumer:00195
73283 redacte+ 30 10 60444 49356 992 S 1.3 0.2 52:56.75 consumer:00196
73289 redacte+ 30 10 60328 49196 944 S 0.7 0.2 20:24.01 consumer:00189
73291 redacte+ 30 10 60164 49004 1000 S 1.3 0.2 62:30.71 consumer:00202
73282 redacte+ 30 10 59960 48876 980 S 0.7 0.2 34:53.59 consumer:00191
73293 redacte+ 30 10 59684 48512 972 S 0.7 0.2 33:04.45 consumer:00204
73277 redacte+ 30 10 58736 47628 1000 S 1.3 0.2 40:08.13 consumer:00183
73285 redacte+ 30 10 58552 47388 1012 S 0.7 0.2 35:10.61 consumer:00190
73292 redacte+ 30 10 57676 46476 980 S 0.3 0.2 22:59.14 consumer:00185
73306 redacte+ 30 10 55792 44716 988 S 1.0 0.1 21:42.18 consumer:00200
73301 redacte+ 30 10 55744 44696 1012 S 0.0 0.1 11:05.04 consumer:00194
73298 redacte+ 30 10 55128 43972 988 S 0.3 0.1 38:29.62 consumer:00187
73286 redacte+ 30 10 55024 43904 980 S 1.0 0.1 31:53.58 consumer:00170
73295 redacte+ 30 10 53276 42156 1008 S 0.3 0.1 18:50.26 consumer:00172
73307 redacte+ 30 10 52960 41884 1004 S 0.7 0.1 25:10.44 consumer:00169
73284 redacte+ 30 10 52492 41464 1024 S 0.3 0.1 25:27.32 consumer:00167
98875 root 20 0 1034604 18088 1248 S 0.3 0.1 80:35.73 log-courier
25696 root 20 0 779288 12232 1144 S 0.0 0.0 1304:55 collectd
2073 root 20 0 60840 12092 1504 S 0.0 0.0 61:52.72 supervisord
1255 root 20 0 51436 9844 1032 S 0.0 0.0 16:55.76 munin-node
87724 root 0 -20 20936 8664 3508 S 0.0 0.0 0:05.79 atop
2149 nobody 20 0 45352 7424 1624 S 0.0 0.0 13:49.30 consumer_probe
16973 www-data 20 0 139148 5896 1536 S 0.0 0.0 874:07.04 nginx
16974 www-data 20 0 139104 5880 1544 S 0.7 0.0 869:08.26 nginx
16975 www-data 20 0 139148 5880 1532 R 8.6 0.0 880:08.30 nginx
16972 www-data 20 0 139152 5756 1532 S 0.0 0.0 869:41.00 nginx
1561 ds-agent 20 0 22336 5628 1004 S 0.0 0.0 87:27.22 datastax_agent_
90639 syslog 20 0 354552 4364 676 S 0.0 0.0 24:13.76 rsyslogd
9887 root 20 0 135816 4300 1296 S 0.0 0.0 0:00.01 nginx
101932 danslim+ 20 0 21332 3800 1752 S 0.0 0.0 0:00.08 bash
101802 root 20 0 105632 3568 2580 S 0.0 0.0 0:00.01 sshd
2065 snmp 20 0 45580 3540 764 S 0.0 0.0 171:34.48 snmpd
130366 cassand+ 20 0 21120 3140 1304 S 0.0 0.0 0:00.03 bash
130349 danslim+ 20 0 21224 3020 1140 S 0.0 0.0 0:00.02 bash
1087 root 20 0 10224 2884 600 S 0.0 0.0 0:08.15 dhclient
1 root 20 0 33648 2216 676 S 0.0 0.0 0:24.25 init
111772 danslim+ 20 0 24072 2080 1156 R 0.3 0.0 0:00.11 top
101931 danslim+ 20 0 105780 1756 752 S 0.3 0.0 0:00.16 sshd
79834 postfix 20 0 40468 1716 852 S 0.0 0.0 0:32.20 tlsmgr
3778 ntp 20 0 31444 1644 1044 S 0.0 0.0 14:21.01 ntpd
85449 root 20 0 59640 1568 1112 S 0.0 0.0 0:00.00 cron
85592 root 20 0 59640 1568 1112 S 0.0 0.0 0:00.00 cron
85740 root 20 0 59640 1568 1112 S 0.0 0.0 0:00.00 cron
85888 root 20 0 59640 1568 1112 S 0.0 0.0 0:00.00 cron
86041 root 20 0 59640 1568 1112 S 0.0 0.0 0:00.00 cron
87484 root 20 0 59640 1568 1112 S 0.0 0.0 0:00.00 cron
1481 root 20 0 61364 1556 876 S 0.0 0.0 1:11.70 sshd
130365 root 20 0 65988 1468 840 S 0.0 0.0 0:00.01 sudo
869 root 20 0 49724 1320 616 S 0.0 0.0 0:00.04 systemd-udevd
79827 postfix 20 0 27624 1264 772 S 0.0 0.0 0:22.47 qmgr
As you can see, there's a discrepancy of ~10 GB between the sum of the values in the RSS column (~18 GB) and the Used-minus-Cached value according to free. Cached memory is correspondingly squeezed. It's worth noting that the java process there is Cassandra, and it's a very heavy hitter.
I've read this answer, which says that the most common reason for free to report more used memory than top is that top doesn't include shared memory in its RSS column. That makes sense, and it may explain part of this reporting gap. But I'm left wondering what tools will help me figure out which process is eating all the memory so I can fix the problem.
It's definitely a problem, because I have another server that should be doing roughly identical work, and on that server there's a much smaller discrepancy (~400 MB versus ~10 GB) between the sum of the RSS values and Used-minus-Cached, and cached memory on that server is correspondingly far less constrained.
How can I find out what's eating all the memory?
System details
uname:
danslimmon@bad-server:~$ uname -a
Linux bad-server 3.13.0-39-generic #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
meminfo:
danslimmon@bad-server:~$ cat /proc/meminfo
MemTotal: 30870568 kB
MemFree: 231428 kB
Buffers: 6048 kB
Cached: 3151268 kB
SwapCached: 0 kB
Active: 19606476 kB
Inactive: 1509568 kB
Active(anon): 17969604 kB
Inactive(anon): 576 kB
Active(file): 1636872 kB
Inactive(file): 1508992 kB
Unevictable: 8656 kB
Mlocked: 8656 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 3512 kB
Writeback: 0 kB
AnonPages: 17968288 kB
Mapped: 838648 kB
Shmem: 7116 kB
Slab: 195856 kB
SReclaimable: 113120 kB
SUnreclaim: 82736 kB
KernelStack: 9112 kB
PageTables: 55016 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 15435284 kB
Committed_AS: 17220572 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 71496 kB
VmallocChunk: 34359632104 kB
HardwareCorrupted: 0 kB
AnonHugePages: 10680320 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 59392 kB
DirectMap2M: 31528960 kB
lsmod:
danslimmon@bad-server:~$ lsmod
Module Size Used by
tcp_diag 12591 0
inet_diag 18543 1 tcp_diag
dm_crypt 23177 0
syscopyarea 12529 0
sysfillrect 12701 0
sysimgblt 12640 0
fb_sys_fops 12703 0
serio_raw 13462 0
isofs 39837 0
raid10 48128 0
raid456 86484 0
async_memcpy 12762 1 raid456
async_raid6_recov 12984 1 raid456
async_pq 13365 1 raid456
async_xor 13160 2 async_pq,raid456
async_tx 13509 5 async_pq,raid456,async_xor,async_memcpy,async_raid6_recov
xor 21411 1 async_xor
raid6_pq 97812 2 async_pq,async_raid6_recov
raid0 17842 0
multipath 13145 0
linear 12894 0
raid1 35530 0
crct10dif_pclmul 14289 0
crc32_pclmul 13113 0
ghash_clmulni_intel 13216 0
aesni_intel 55624 0
aes_x86_64 17131 1 aesni_intel
lrw 13286 1 aesni_intel
gf128mul 14951 1 lrw
glue_helper 13990 1 aesni_intel
ablk_helper 13597 1 aesni_intel
cryptd 20359 3 ghash_clmulni_intel,aesni_intel,ablk_helper
psmouse 106714 0
floppy 69418 0
ixgbevf 50771 0
slabinfo:
danslimmon@bad-server:~$ sudo cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
isofs_inode_cache 0 0 632 51 8 : tunables 0 0 0 : slabdata 0 0 0
UDPLITEv6 0 0 1088 30 8 : tunables 0 0 0 : slabdata 0 0 0
UDPv6 480 480 1088 30 8 : tunables 0 0 0 : slabdata 16 16 0
tw_sock_TCPv6 1024 1024 256 64 4 : tunables 0 0 0 : slabdata 16 16 0
TCPv6 256 256 1984 16 8 : tunables 0 0 0 : slabdata 16 16 0
kcopyd_job 0 0 3312 9 8 : tunables 0 0 0 : slabdata 0 0 0
dm_uevent 0 0 2608 12 8 : tunables 0 0 0 : slabdata 0 0 0
dm_rq_target_io 0 0 424 38 4 : tunables 0 0 0 : slabdata 0 0 0
cfq_queue 0 0 232 70 4 : tunables 0 0 0 : slabdata 0 0 0
bsg_cmd 0 0 312 52 4 : tunables 0 0 0 : slabdata 0 0 0
mqueue_inode_cache 36 36 896 36 8 : tunables 0 0 0 : slabdata 1 1 0
fuse_request 0 0 408 40 4 : tunables 0 0 0 : slabdata 0 0 0
fuse_inode 0 0 768 42 8 : tunables 0 0 0 : slabdata 0 0 0
ecryptfs_key_record_cache 0 0 576 56 8 : tunables 0 0 0 : slabdata 0 0 0
ecryptfs_inode_cache 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
fat_inode_cache 0 0 712 46 8 : tunables 0 0 0 : slabdata 0 0 0
fat_cache 0 0 40 102 1 : tunables 0 0 0 : slabdata 0 0 0
hugetlbfs_inode_cache 864 864 600 54 8 : tunables 0 0 0 : slabdata 16 16 0
jbd2_journal_handle 1360 1360 48 85 1 : tunables 0 0 0 : slabdata 16 16 0
jbd2_journal_head 6480 6480 112 36 1 : tunables 0 0 0 : slabdata 180 180 0
jbd2_revoke_table_s 512 512 16 256 1 : tunables 0 0 0 : slabdata 2 2 0
jbd2_revoke_record_s 16128 16768 32 128 1 : tunables 0 0 0 : slabdata 131 131 0
ext4_inode_cache 7115 18546 984 33 8 : tunables 0 0 0 : slabdata 562 562 0
ext4_xattr 0 0 88 46 1 : tunables 0 0 0 : slabdata 0 0 0
ext4_free_data 9408 9408 64 64 1 : tunables 0 0 0 : slabdata 147 147 0
ext4_allocation_context 6780 6780 136 60 2 : tunables 0 0 0 : slabdata 113 113 0
ext4_io_end 8848 8848 72 56 1 : tunables 0 0 0 : slabdata 158 158 0
ext4_extent_status 15380 32232 40 102 1 : tunables 0 0 0 : slabdata 316 316 0
dquot 1024 1024 256 64 4 : tunables 0 0 0 : slabdata 16 16 0
pid_namespace 0 0 2192 14 8 : tunables 0 0 0 : slabdata 0 0 0
user_namespace 0 0 264 62 4 : tunables 0 0 0 : slabdata 0 0 0
posix_timers_cache 0 0 248 66 4 : tunables 0 0 0 : slabdata 0 0 0
UDP-Lite 0 0 896 36 8 : tunables 0 0 0 : slabdata 0 0 0
xfrm_dst_cache 0 0 448 36 4 : tunables 0 0 0 : slabdata 0 0 0
ip_fib_trie 146 146 56 73 1 : tunables 0 0 0 : slabdata 2 2 0
UDP 576 576 896 36 8 : tunables 0 0 0 : slabdata 16 16 0
tw_sock_TCP 32835 47680 256 64 4 : tunables 0 0 0 : slabdata 745 745 0
TCP 1134 1134 1792 18 8 : tunables 0 0 0 : slabdata 63 63 0
blkdev_queue 84 84 2264 14 8 : tunables 0 0 0 : slabdata 6 6 0
blkdev_requests 2007 2142 384 42 4 : tunables 0 0 0 : slabdata 51 51 0
blkdev_ioc 780 780 104 39 1 : tunables 0 0 0 : slabdata 20 20 0
fsnotify_event 1156 1156 120 68 2 : tunables 0 0 0 : slabdata 17 17 0
sock_inode_cache 4991 5661 640 51 8 : tunables 0 0 0 : slabdata 111 111 0
shmem_inode_cache 1392 1392 672 48 8 : tunables 0 0 0 : slabdata 29 29 0
Acpi-ParseExt 15546 15624 72 56 1 : tunables 0 0 0 : slabdata 279 279 0
Acpi-State 306 306 80 51 1 : tunables 0 0 0 : slabdata 6 6 0
Acpi-Namespace 4182 4182 40 102 1 : tunables 0 0 0 : slabdata 41 41 0
taskstats 784 784 328 49 4 : tunables 0 0 0 : slabdata 16 16 0
proc_inode_cache 8535 10250 648 50 8 : tunables 0 0 0 : slabdata 205 205 0
sigqueue 816 816 160 51 2 : tunables 0 0 0 : slabdata 16 16 0
bdev_cache 468 468 832 39 8 : tunables 0 0 0 : slabdata 12 12 0
sysfs_dir_cache 27878 28584 112 36 1 : tunables 0 0 0 : slabdata 794 794 0
mnt_cache 357 357 320 51 4 : tunables 0 0 0 : slabdata 7 7 0
inode_cache 12096 12096 584 56 8 : tunables 0 0 0 : slabdata 216 216 0
dentry 30931 45864 192 42 2 : tunables 0 0 0 : slabdata 1092 1092 0
iint_cache 0 0 72 56 1 : tunables 0 0 0 : slabdata 0 0 0
buffer_head 519830 530088 104 39 1 : tunables 0 0 0 : slabdata 13592 13592 0
mm_struct 3153 3276 896 36 8 : tunables 0 0 0 : slabdata 91 91 0
files_cache 969 969 640 51 8 : tunables 0 0 0 : slabdata 19 19 0
signal_cache 2340 2340 1088 30 8 : tunables 0 0 0 : slabdata 78 78 0
sighand_cache 1230 1230 2112 15 8 : tunables 0 0 0 : slabdata 82 82 0
task_xstate 4212 4212 832 39 8 : tunables 0 0 0 : slabdata 108 108 0
task_struct 1227 1270 6144 5 8 : tunables 0 0 0 : slabdata 254 254 0
anon_vma 9665 12544 64 64 1 : tunables 0 0 0 : slabdata 196 196 0
shared_policy_node 16716 20995 48 85 1 : tunables 0 0 0 : slabdata 247 247 0
numa_policy 170 170 24 170 1 : tunables 0 0 0 : slabdata 1 1 0
radix_tree_node 43100 64581 568 57 8 : tunables 0 0 0 : slabdata 1133 1133 0
idr_layer_cache 390 390 2112 15 8 : tunables 0 0 0 : slabdata 26 26 0
dma-kmalloc-8192 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-4096 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-2048 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-1024 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-512 0 0 512 64 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-256 0 0 256 64 4 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-128 0 0 128 64 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-192 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-96 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-8192 372 372 8192 4 8 : tunables 0 0 0 : slabdata 93 93 0
kmalloc-4096 295 328 4096 8 8 : tunables 0 0 0 : slabdata 41 41 0
kmalloc-2048 553 592 2048 16 8 : tunables 0 0 0 : slabdata 37 37 0
kmalloc-1024 2445 2496 1024 32 8 : tunables 0 0 0 : slabdata 78 78 0
kmalloc-512 25536 25984 512 64 8 : tunables 0 0 0 : slabdata 406 406 0
kmalloc-256 11307 12864 256 64 4 : tunables 0 0 0 : slabdata 201 201 0
kmalloc-192 15569 20202 192 42 2 : tunables 0 0 0 : slabdata 481 481 0
kmalloc-128 15904 25216 128 64 2 : tunables 0 0 0 : slabdata 394 394 0
kmalloc-96 9618 9618 96 42 1 : tunables 0 0 0 : slabdata 229 229 0
kmalloc-64 22131 49536 64 64 1 : tunables 0 0 0 : slabdata 774 774 0
kmalloc-32 24966 29824 32 128 1 : tunables 0 0 0 : slabdata 233 233 0
kmalloc-16 54975 57344 16 256 1 : tunables 0 0 0 : slabdata 224 224 0
kmalloc-8 10752 10752 8 512 1 : tunables 0 0 0 : slabdata 21 21 0
kmem_cache_node 256 256 64 64 1 : tunables 0 0 0 : slabdata 4 4 0
kmem_cache 256 256 256 64 4 : tunables 0 0 0 : slabdata 4 4 0
slabtop:
danslimmon@bad-server:~$ sudo slabtop -o
Active / Total Objects (% used) : 939965 / 1147485 (81.9%)
Active / Total Slabs (% used) : 23919 / 23919 (100.0%)
Active / Total Caches (% used) : 63 / 96 (65.6%)
Active / Total Size (% used) : 174670.15K / 219106.31K (79.7%)
Minimum / Average / Maximum Object : 0.01K / 0.19K / 8.00K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
507702 423501 83% 0.10K 13018 39 52072K buffer_head
64581 46172 71% 0.55K 1133 57 36256K radix_tree_node
58624 58624 100% 0.02K 229 256 916K kmalloc-16
49152 22498 45% 0.06K 768 64 3072K kmalloc-64
46368 32154 69% 0.19K 1104 42 8832K dentry
44736 38006 84% 0.25K 699 64 11184K tw_sock_TCP
30090 18932 62% 0.04K 295 102 1180K ext4_extent_status
29824 22268 74% 0.03K 233 128 932K kmalloc-32
29448 28805 97% 0.11K 818 36 3272K sysfs_dir_cache
25664 25664 100% 0.50K 401 64 12832K kmalloc-512
25216 15466 61% 0.12K 394 64 3152K kmalloc-128
20995 18058 86% 0.05K 247 85 988K shared_policy_node
20160 15866 78% 0.19K 480 42 3840K kmalloc-192
20031 6286 31% 0.96K 607 33 19424K ext4_inode_cache
16768 16256 96% 0.03K 131 128 524K jbd2_revoke_record_s
15736 15736 100% 0.07K 281 56 1124K Acpi-ParseExt
12864 10179 79% 0.25K 201 64 3216K kmalloc-256
12224 11706 95% 0.06K 191 64 764K anon_vma
12096 12096 100% 0.57K 216 56 6912K inode_cache
10752 10752 100% 0.01K 21 512 84K kmalloc-8
dmesg: Gist
Here are the free results and /proc/meminfo from a server doing the same job, but without this missing-memory symptom:
danslimmon@good-server:~$ free -m
total used free shared buffers cached
Mem: 30148 26946 3201 2 156 9907
-/+ buffers/cache: 16882 13265
Swap: 0 0 0
danslimmon@good-server:~$ cat /proc/meminfo
MemTotal: 30871560 kB
MemFree: 3239620 kB
Buffers: 160000 kB
Cached: 10157876 kB
SwapCached: 0 kB
Active: 16048988 kB
Inactive: 4190252 kB
Active(anon): 9944028 kB
Inactive(anon): 512 kB
Active(file): 6104960 kB
Inactive(file): 4189740 kB
Unevictable: 6737636 kB
Mlocked: 6737636 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 13548 kB
Writeback: 0 kB
AnonPages: 16660296 kB
Mapped: 1840700 kB
Shmem: 2752 kB
Slab: 380296 kB
SReclaimable: 295224 kB
SUnreclaim: 85072 kB
KernelStack: 12232 kB
PageTables: 48640 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 15435780 kB
Committed_AS: 16161132 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 71464 kB
VmallocChunk: 34359649788 kB
HardwareCorrupted: 0 kB
AnonHugePages: 14536704 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 49152 kB
DirectMap2M: 31539200 kB
danslimmon@good-server:~$ smem -tw
Area Used Cache Noncache
firmware/hardware 0 0 0
kernel image 0 0 0
kernel dynamic memory 8708520 8351840 356680
userspace memory 18329120 1743636 16585484
free memory 3833920 3833920 0
----------------------------------------------------------
30871560 13929396 16942164
source

Comments: I've added /proc/meminfo from a server that doesn't exhibit this missing-memory behavior and that has even more memory allocated to anonymous huge pages. Could /proc/xen/balloon account for your missing memory?

Answers:
Your smem -tw output shows that your kernel is consuming over 9 GB of dynamic memory (the "kernel dynamic memory" Noncache figure of 9970136 KiB). So some kernel module is consuming a lot of memory. The prime candidates are closed-source blobs, such as the NVIDIA kernel driver.
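A quick way to see whether the slab allocator accounts for it (a minimal sketch; in your /proc/meminfo, Slab is only ~191 MB, so most of those 9 GB must live elsewhere in the kernel):

# The usual suspects for kernel dynamic memory, straight from meminfo:
grep -E '^(Slab|SReclaimable|SUnreclaim|VmallocUsed|PageTables|KernelStack)' /proc/meminfo
# Largest slab caches by total size:
sudo slabtop -o | head -n 12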
Can you post the output of lsmod and cat /proc/slabinfo?
source
Apparently you're aware of the shared-memory issue when interpreting RSS numbers, so I won't expand on that.
Linux systems vary a lot (think of embedded systems, for example), but the process list you're getting from top looks incomplete to me, with few system processes. You'll notice that you have an init process with PID 1 and then nothing until PID 869. I think you'll find that many of the intervening PIDs belong to active kernel processes, and it's possible that something there is using a lot of RAM; @shodanshok's answer certainly points in that direction. You would see those process IDs under /proc if you have a procfs mounted.
I'd also like to know whether you have an in-memory filesystem with substantial content (e.g. /tmp). What does df -h show? (See the sketch below.)
If you have a procfs mounted on /proc, you'll see the full list of process IDs there. Also, /proc/$PID/maps lists the regions of physical memory that a process maps into its address space, although interpreting it is far from trivial.
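A minimal sketch of both checks (PSS divides each shared page among the processes mapping it, so summing PSS avoids the double-counting that summed RSS suffers from):

# tmpfs filesystems hold their contents in RAM and show up as cache:
df -h -t tmpfs
# Sum PSS (reported in KiB) across every process' smaps:
sudo grep -s '^Pss:' /proc/[0-9]*/smaps | awk '{ sum += $2 } END { printf "%d MB total PSS\n", sum / 1024 }'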
source
If you're using Apache Cassandra, could you specify which version you're running?
DataDog provides a good Cassandra monitoring tool. (Source: https://www.datadoghq.com/blog/how-to-monitor-cassandra-performance-metrics/ )
There are a few other checklists for Cassandra as well.
source
Oracle recommends disabling transparent huge pages. Once you disable them, you'll see AnonHugePages drop to 0. There's a good answer on this at: disable transparent hugepages
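For reference, checking and disabling THP at runtime looks like this (a sketch; the sysfs path is standard on 3.13 kernels, and the change does not persist across reboots):

# Current setting; the active mode is shown in brackets:
cat /sys/kernel/mm/transparent_hugepage/enabled
# Disable THP until the next reboot:
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
# AnonHugePages in /proc/meminfo should then shrink toward 0:
grep AnonHugePages /proc/meminfo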
source
Comments: Consider the madvise setting instead of disable. Of course, people tend to blindly copy examples without giving them a single thought.
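To illustrate the difference (a sketch; madvise only uses huge pages where an application explicitly opts in):

# With madvise, only regions flagged via madvise(MADV_HUGEPAGE) get
# transparent huge pages; everything else stays on 4 KiB pages.
echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/enabled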