Analyze Memory Usage

gperftools / libtcmalloc

http://goog-perftools.sourceforge.net/doc/tcmalloc.html
http://goog-perftools.sourceforge.net/doc/heap_profiler.html
https://groups.google.com/forum/#!topic/google-perftools/hnvkJLxG1_I

The main advantage is its built-in heap profiler, which can also be used for finding memory leaks.

Memory usage.

The manual warns of increased memory usage, since tcmalloc uses a large array for managing memory:

TCMalloc allocates approximately 6 MB of memory.
  • span array takes 4MB of space on 32bit machine
  • It would be easy to roll a specialized version that trades a little bit of speed for more space efficiency.

https://groups.google.com/forum/#!topic/google-perftools/xPQaNLHo_44

Trial-and-error tests on the devel feed resulted in OOM on some installations, so switching back to anrmalloc was needed.

Memory profiler

On the platform, modify /etc/runit/dss/run, then restart dss. The same applies to dsa.

export LD_PRELOAD=/usr/lib/libtcmalloc.so.4
# tcmalloc memory profiling, killall -SIGUSR2 dss to create dump
# export HEAPPROFILE=/tmp/dss.hprof HEAPPROFILESIGNAL=12

Force profile creation

$ killall -12 dss
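
The profiler numbers its dumps as <HEAPPROFILE>.NNNN.heap (compare /tmp/dsa.hprof.0001.heap further down), so a small helper can send the signal and wait until a new file appears. This is only a sketch, assuming signal 12 and the /tmp/dss.hprof prefix configured above; the function names count_dumps and dump_heap are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of gperftools.
# Assumes HEAPPROFILE=/tmp/dss.hprof and HEAPPROFILESIGNAL=12 as set above.

# count_dumps <profile-prefix>: number of existing <prefix>.*.heap files
count_dumps() {
    # ls fails silently when no dump exists yet, wc then reports 0
    ls "$1".*.heap 2>/dev/null | wc -l | tr -d ' '
}

# dump_heap <process-name> <profile-prefix> [timeout-seconds]
dump_heap() {
    before=$(count_dumps "$2")
    killall -12 "$1" || return 1
    tries=${3:-10}
    while [ "$tries" -gt 0 ]; do
        # a new file means the profiler has written its dump
        [ "$(count_dumps "$2")" -gt "$before" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    echo "no new dump for $2 after signal" >&2
    return 1
}
```

Typical call: dump_heap dss /tmp/dss.hprof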

Copy the files to your host, then analyze them. Source code is not required.

. /opt/poky/1.3/environment-setup-armv7a-vfp-neon-poky-linux-gnueabi
export PPROF_TOOLS=objdump:$OBJDUMP,nm:$NM,addr2line:${CROSS_COMPILE}addr2line,c++filt:${CROSS_COMPILE}c++filt
google-pprof --lib_prefix=$OECORE_TARGET_SYSROOT /tmp/dsa /tmp/dsa.hprof.0001.heap --heapcheck --pdf >output.pdf

This creates a function call graph. Each function name is printed in a font size according to its contribution to memory consumption, so offenders stick out. See the heap profiler link above.
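
For leak hunting it often helps to subtract an early dump from a late one, which pprof supports through its --base option. The following is a sketch under the assumption that the dumps were copied to the host with their original names; oldest_dump, newest_dump and leak_report are hypothetical helper names, and --text is used instead of --pdf to get a plain listing.

```shell
#!/bin/sh
# Hypothetical helpers around google-pprof; dump names sort by their 4-digit counter.

# oldest_dump / newest_dump <profile-prefix>
oldest_dump() { ls "$1".*.heap 2>/dev/null | sort | head -n 1; }
newest_dump() { ls "$1".*.heap 2>/dev/null | sort | tail -n 1; }

# leak_report <binary> <profile-prefix>: show what grew between first and last dump
leak_report() {
    base=$(oldest_dump "$2")
    last=$(newest_dump "$2")
    [ -n "$base" ] && [ "$base" != "$last" ] || {
        echo "need at least two dumps for $2" >&2
        return 1
    }
    # OECORE_TARGET_SYSROOT comes from the poky environment-setup script
    google-pprof --lib_prefix="$OECORE_TARGET_SYSROOT" --text \
        --base="$base" "$1" "$last"
}
```

Typical call: leak_report /tmp/dsa /tmp/dsa.hprof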

CPU profiler

gperftools also contains a statistical performance profiler.

export LD_PRELOAD="/usr/lib/libprofiler.so" 
export CPUPROFILE=/tmp/dss.prof CPUPROFILESIGNAL=12

The signal is used to start/stop profiling, typical usage:

pkill -12 dss
for i in `seq 1000`; do
./tools/dss_json.sh /json/event/subscribe?subscriptionID=30....
done
pkill -12 dss
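
The start/stop pattern above can be wrapped so that profiling is always switched off again, even when the workload fails. A minimal sketch, assuming CPUPROFILESIGNAL=12 as configured above; profile_while is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical wrapper: toggle the CPU profiler around an arbitrary command.

# profile_while <process-name> <command> [args...]
profile_while() {
    proc=$1
    shift
    pkill -12 "$proc" || return 1   # first signal: start profiling
    "$@"                            # run the workload
    status=$?
    pkill -12 "$proc"               # second signal: stop profiling
    return $status
}
```

Typical call: profile_while dss ./my-workload.sh, where my-workload.sh stands for whatever generates load. The return value is that of the workload, or 1 if no matching process was found.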

Then evaluate the profile:

google-pprof -gv build-x86/build/dss /tmp/dss.prof.0

Troubleshooting
dss may crash with SIGSEGV in libunwind. The workaround is to recompile google-perftools with frame pointer support:

./configure --enable-frame-pointers

Valgrind/Massif

Collect data

. env.sh
VALGRIND=0

if [ "$1" ==  "--valgrind" ]; then
  echo "valgrind" 
  VALGRIND=1
fi

CMD="./build/dss --webroot=$PREFIX/share/dss-web/webroot" 
pushd dss-mainline/
if [ $VALGRIND -eq 0 ]; then
  $CMD
else
  exec valgrind --tool=massif $CMD
fi
popd

Make sure to disable tcmalloc

$ export CXXFLAGS="-g -O0" 
$ ./configure --prefix=$PREFIX --with-search=$PREFIX --enable-debug --enable-http --disable-soap --disable-libtcmalloc --disable-sendmail

Evaluate data

$ git clone git://anongit.kde.org/massif-visualizer

See checkinstall if you want to create a package.

$ massif-visualizer massif.out.8937 
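
When no GUI is at hand, the massif output file can also be inspected as text. It stores one mem_heap_B=<bytes> line per snapshot, so the peak heap size can be pulled out with awk. A small sketch; massif_peak is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report the peak heap size recorded in a massif output file.

# massif_peak <massif.out file>: largest mem_heap_B value over all snapshots, in bytes
massif_peak() {
    awk -F= '
        $1 == "mem_heap_B" && $2 + 0 > max { max = $2 + 0 }
        END { printf "%.0f\n", max }
    ' "$1"
}
```

Typical call: massif_peak massif.out.8937. For a full text report, valgrind ships ms_print, which takes the same file as its argument.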

Possible output and links

http://stackoverflow.com/questions/1623771/valgrind-massif-tool-output-graphical-interface
https://projects.kde.org/projects/extragear/sdk/massif-visualizer/repository

Summary

  • + No need to recompile the existing binary
  • - Hard to run on embedded targets, e.g. deploy with testing
    • Increased memory footprint
    • Need to cross-compile the whole suite
    • Performance penalty

Hence, tests were synthetic, since the biggest installation was LegoLand.

Not evaluated

Lexmark ANRmalloc

https://github.com/lxkiwatkins/anrmalloc
http://elinux.org/File:Elc2013-embedded-memory-management.pdf
https://lwn.net/Articles/531077/ -- callback if memory pressure is high

keywords anrmalloc lexmark

Similar in concept to tcmalloc (madvise(MADV_DONTNEED)), but has no built-in heap profiler.
Currently used in production.

Compare revisions

Prepare the dSS devshell in OE (prerequisite: choose your feed/setup as needed via setup-oe-core.sh/setup-yocto.sh):

$ bitbake -c cleanall dss
$ bitbake -c install dss

Enter devshell:

$ bitbake -c devshell dss

Build range of revisions that you are interested in:

#!/bin/sh
COPY_TO="/tmp/dss-revisions" 
mkdir -p $COPY_TO

COUNT=0
for rev in `git log --after "Fri Dec 19 17:40:31 2014 +0100" --merges | egrep -v "Merge commit" | grep commit | cut -d ' ' -f 2`
do
    NUM=`printf "%02d" $COUNT`
    make clean && git checkout $rev && \
    ../temp/run.do_configure && ../temp/run.do_compile && ../temp/run.do_install && \
    cp ../image/usr/bin/dss $COPY_TO/$NUM.dss.$rev && \
    $STRIP ../image/usr/bin/dss && \
    mv ../image/usr/bin/dss $COPY_TO/$NUM.dss.stripped.$rev
    COUNT=`expr $COUNT + 1`
done

Copy binaries that you want to test to target, copy the measure script to target. Install procps using:

# opkg install procps

Tune the number of samples and the wait between samples (variables at the top of the script), then run the script:

# ./measure my-dir-with-dss-binaries/

Output logs will be written to /tmp/memory_log-*

Also, dss.log for each revision run will be backed up in /var/log/dss/
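
To illustrate what such a measurement boils down to, here is a minimal sampling loop. It is not the attached measure script, only a sketch that reads VmRSS from /proc; the function names rss_kb and sample_rss are made up, and the sample count and wait time correspond to the two tunables mentioned above.

```shell
#!/bin/sh
# Hypothetical sketch, not the attached measure script.

# rss_kb <pid>: resident set size in kB as reported by the kernel
rss_kb() {
    awk '$1 == "VmRSS:" { print $2 }' "/proc/$1/status"
}

# sample_rss <pid> <samples> <wait-seconds>: print "<epoch-seconds> <rss-kB>" per sample
sample_rss() {
    i=0
    while [ "$i" -lt "$2" ]; do
        rss=$(rss_kb "$1") || return 1      # process is gone
        [ -n "$rss" ] || return 1           # no VmRSS line
        echo "$(date +%s) $rss"
        i=$((i + 1))
        # no need to wait after the last sample
        if [ "$i" -lt "$2" ]; then sleep "$3"; fi
    done
    return 0
}
```

Typical call: sample_rss "$(pidof dss)" 60 5 > /tmp/rss.log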

tcmalloc.svg (272 KB) Andreas Fenkart, 06/07/2013 11:00 AM

measure - script to compare memory usage of different dSS revisions (3.41 KB) Sergey Bostandzhyan, 02/17/2015 02:51 PM