Shawn Heisey

GC Tuning

These Java GC tuning options work well for me with Solr versions as new as 4.9.0 on 64-bit CentOS 6 with Oracle Java 7. I typically use a heap of 6GB or 7GB.

JVM_OPTS=" \
-XX:NewRatio=3 \
-XX:SurvivorRatio=4 \
-XX:TargetSurvivorRatio=90 \
-XX:MaxTenuringThreshold=8 \
-XX:+UseConcMarkSweepGC \
-XX:+CMSScavengeBeforeRemark \
-XX:PretenureSizeThreshold=64m \
-XX:CMSFullGCsBeforeCompaction=1 \
-XX:+UseCMSInitiatingOccupancyOnly \
-XX:CMSInitiatingOccupancyFraction=70 \
-XX:CMSTriggerPermRatio=80 \
-XX:CMSMaxAbortablePrecleanTime=6000 \
-XX:+CMSParallelRemarkEnabled \
-XX:+ParallelRefProcEnabled \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"

Here are the options I used to use:

-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:NewRatio=3
-XX:MaxTenuringThreshold=8
-XX:+CMSParallelRemarkEnabled
-XX:+ParallelRefProcEnabled
-XX:+UseLargePages
-XX:+AggressiveOpts
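
Both option sets use -XX:NewRatio=3, and the current set adds -XX:SurvivorRatio=4. As a rough sketch of what those ratios imply for a 6GB heap, the arithmetic below assumes the usual HotSpot meaning of the two flags (old:young = NewRatio:1, eden:survivor = SurvivorRatio:1 with two survivor spaces). The function names are made up for illustration, and the JVM aligns the real sizes, so treat the numbers as approximate.

```shell
# Rough generation sizing implied by -XX:NewRatio and -XX:SurvivorRatio.
# All sizes are in MB; integer division, so results are approximate.

# NewRatio=N means old:young = N:1, so young = heap / (N + 1).
young_gen_mb() {
  local heap_mb=$1 new_ratio=$2
  echo $(( heap_mb / (new_ratio + 1) ))
}

# SurvivorRatio=S means eden:survivor = S:1, and there are two survivor
# spaces, so each survivor = young / (S + 2).
survivor_mb() {
  local young_mb=$1 survivor_ratio=$2
  echo $(( young_mb / (survivor_ratio + 2) ))
}

heap=6144                          # a 6GB heap
young=$(young_gen_mb "$heap" 3)    # NewRatio=3
surv=$(survivor_mb "$young" 4)     # SurvivorRatio=4
eden=$(( young - 2 * surv ))
echo "young=${young}MB eden=${eden}MB survivor=${surv}MB (x2) old=$(( heap - young ))MB"
# prints: young=1536MB eden=1024MB survivor=256MB (x2) old=4608MB
```

With these ratios, a quarter of the heap goes to the young generation and each survivor space gets a sixth of that.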

A note about the UseLargePages option: This option will not actually do anything unless you allocate memory to Huge Pages in your operating system. If you do so, memory usage reporting with the "top" command will probably only show a few hundred MB of resident memory used by your Solr process, even if it is in fact using several gigabytes of heap.
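
One way to check whether any huge pages are allocated is to read the HugePages_Total line from /proc/meminfo. This is a minimal sketch for Linux only; the function name is hypothetical, and the optional file argument exists just so the parsing can be tried against a sample file.

```shell
# Print the HugePages_Total value from a meminfo-style file.
# Defaults to /proc/meminfo; the optional argument is only for testing.
hugepages_total() {
  local meminfo=${1:-/proc/meminfo}
  awk '/^HugePages_Total:/ { print $2 }' "$meminfo"
}

# Report on the running system, if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
  total=$(hugepages_total)
  if [ "${total:-0}" -gt 0 ]; then
    echo "Huge pages allocated: ${total}"
  else
    echo "No huge pages allocated; UseLargePages will probably do nothing."
  fi
fi
```

On most distributions the allocation itself is controlled by the vm.nr_hugepages sysctl; the right value depends on the huge page size and on how much heap you want backed by huge pages.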

I got a recommendation on IRC for a simpler alternative. One day when I find the time, I will do some more extensive testing.

<@gbowyer> try these on for size -server -Xmx8g -Xms8g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80

Here's another set of parameters that I got via IRC, which I used as inspiration for my current parameters. The user who provided these has good performance with a 48GB heap:

    SOLR_JVM_OPTS="-Djetty.port=8983 \
    -server \
    -Xloggc:/var/log/solr/causearch-gc.log \
    -XX:+PrintGCTimeStamps \
    -XX:+PrintGCDetails \
    -XX:MaxPermSize=128m \
    -Xms48009m -Xmx48009m \
    -XX:NewSize=1024m \
    -XX:SurvivorRatio=1 \
    -XX:TargetSurvivorRatio=90 \
    -XX:MaxTenuringThreshold=8 \
    -XX:+UseConcMarkSweepGC \
    -XX:+CMSScavengeBeforeRemark \
    -XX:PretenureSizeThreshold=512m \
    -XX:CMSFullGCsBeforeCompaction=1 \
    -XX:+UseCMSInitiatingOccupancyOnly \
    -XX:CMSInitiatingOccupancyFraction=70 \
    -XX:CMSTriggerPermRatio=80 \
    -XX:CMSMaxAbortablePrecleanTime=6000 \
    -XX:+CMSConcurrentMTEnabled \
    -XX:+UseParNewGC \
    -XX:ConcGCThreads=7 \
    -XX:ParallelGCThreads=7 \
    -XX:+UseLargePages \
    "

< skopii> when i was testing I wrote a comment #some
          newsize/survivor ratios that work well: 16G/9@60rps,
          8G/6@30rps, 4G/3@15rps, 1G/1@2RPS(idle)

Yet another set:

13:25 < AaronD> elyograg: I can't vouch too hard for these (and some may have
                even become default-on at this point), but what I'm currently
                using for our 3.6.1 solrs on large-mem/large-heap boxes:
                -XX:+AggressiveOpts -XX:+HeapDumpOnOutOfMemoryError
                -XX:+OptimizeStringConcat -XX:+UseFastAccessorMethods
                -XX:+UseG1GC -XX:+UseStringCache -XX:-UseSplitVerifier
                -XX:MaxGCPauseMillis=50

Init script

This is a Red Hat-specific init script for a custom installation created from the Solr example:

#
# chkconfig: - 80 45
# description: Starts and stops Solr

# Source redhat init.d function library.
. /etc/rc.d/init.d/functions

# options that are commonly overridden
SOLRROOT=/opt/solr4
SOLRHOME=/index/solr4
LISTENPORT=8983
JMINMEM=256M
JMAXMEM=2048M
JBIN=/usr/bin/java
FUSER=/sbin/fuser
STOPKEY=mystopkey
STOPPORT=8078
JMXPORT=8686
STARTUSER=solr

# less commonly overridden options
PROC_CHECK_ARG="start\.jar"
LOG_OPTS="-Dlog4j.configuration=file:etc/log4j.properties"
JVM_OPTS="-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts"
GCLOG_OPTS="-verbose:gc -Xloggc:logs/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails"
JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=${JMXPORT} -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
SYSCONFIG=/etc/sysconfig/solr4

# Source sysconfig for this app. This is used to override options.
if [ -f ${SYSCONFIG} ];
then
  . ${SYSCONFIG}
fi

# Build required env variables.
STARTARGS="-Xms${JMINMEM} -Xmx${JMAXMEM} ${LOG_OPTS} ${JVM_OPTS} ${GCLOG_OPTS} ${JMX_OPTS} -Dsolr.solr.home=${SOLRHOME} -Djetty.port=${LISTENPORT} -DSTOP.PORT=${STOPPORT} -DSTOP.KEY=${STOPKEY}"
STOPARGS="-DSTOP.PORT=${STOPPORT} -DSTOP.KEY=${STOPKEY}"
PIDFILE=${SOLRROOT}/run/jetty.pid

# Grab UID of start user.  Abort loudly if it doesn't exist.
STARTUID=`id -u ${STARTUSER} 2> /dev/null`
RET=$?
if [ ${RET} -ne 0 ];
then
        echo "User ${STARTUSER} does not exist."
        exit 1
fi

# If we're root, restart as unprivileged user. Abort if anyone else.
if [ ${UID} -eq 0 ];
then
  exec su ${STARTUSER} -c "$0 $*"
elif [ ${UID} -eq ${STARTUID} ];
then
  echo "do nothing" > /dev/null 2> /dev/null
else
  echo "This needs to be run as root or ${STARTUSER}. "
  exit 1
fi

runexit() {
  echo "Already running!"
  exit 1
}

start() {
  echo -n "Starting Solr... "
  if [ "x${PID}" == "x" ];
  then
    PID=0
  fi
  # Let's check to see if the PID is up and is actually Solr
  checkproccmdline > /dev/null 2> /dev/null
  if [ ${RET} -eq 0 ];
  then
    # If PID is up and is Solr, fail.
    runexit
  fi
  # Pre-emptive extreme prejudice strike on anything using Solr's port
  killbyport
  sleep 1
  # Start 'er up.
  cd ${SOLRROOT}
  ${JBIN} ${STARTARGS} -jar ${SOLRROOT}/start.jar >logs/out 2>logs/err &
  PID=$!
  echo ${PID} > ${PIDFILE}
  disown ${PID}
  echo "done"
}

checkproccmdline() {
  grep ${PROC_CHECK_ARG} /proc/${PID}/cmdline > /dev/null 2> /dev/null
  RET=$?
}

termbypid() {
  checkpid ${PID}
  RET=$?
  if [ ${RET} -eq 0 ];
  then
    checkproccmdline
    if [ ${RET} -eq 0 ];
    then
      kill $PID
    else
      echo
      echo "pid ${PID} is not Solr!"
      echo > ${PIDFILE}
    fi
  fi
}

killbypid() {
  checkpid ${PID}
  RET=$?
  if [ ${RET} -eq 0 ];
  then
    checkproccmdline
    if [ ${RET} -eq 0 ];
    then
      kill -9 $PID
    else
      echo
      echo "pid ${PID} is not Solr!"
      echo > ${PIDFILE}
    fi
  fi
}

termbyport() {
  ${FUSER} -sk -n tcp ${LISTENPORT}
}

killbyport() {
  ${FUSER} -sk9 -n tcp ${LISTENPORT}
}

checkpidsleep() {
  # Wait up to 10 seconds for the process to exit, checking once a second.
  for i in 1 2 3 4 5 6 7 8 9 10
  do
    checkpid ${PID} || break
    sleep 1
  done
}

stop() {
  echo -n "Stopping Solr... "
  # check to see if PID is empty.
  if [ "x${PID}" == "x" ];
  then
    # term then kill anything using Solr's port, assume success.
    termbyport
    sleep 5
    killbyport
  else
    # Try Jetty's stop mechanism.
    ${JBIN} $STOPARGS -jar ${SOLRROOT}/start.jar --stop \
      > /dev/null 2> /dev/null
    checkpidsleep
    # Controlled death by PID.
    termbypid
    checkpidsleep
    # Controlled death by Solr port.
    termbyport
    checkpidsleep
    # Sudden death by PID and Solr port.
    killbyport
    killbypid
    checkpidsleep
    # report failure if none of the above worked.
    if checkpid $PID 2>&1;
    then
      echo "failed!"
      exit 1
    fi
  fi
  echo "done"
}

UP=`cat /proc/uptime | cut -f1 -d\ | cut -f1 -d.`
# If the machine has been up less than 3 minutes, wipe the pidfile.
if [ ${UP} -lt 180 ];
then
  echo > ${PIDFILE}
fi

PID=`cat ${PIDFILE}`

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  restart)
    stop
    start
    ;;
  *)
    echo $"Usage: $0 {start|stop|restart}"
    exit 1
esac

exit $?
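
The script sources /etc/sysconfig/solr4 if it exists, so overrides can live there as plain shell variable assignments. The contents below are a hypothetical example, not a file from a real install; the values are only illustrations.

```shell
# /etc/sysconfig/solr4 (hypothetical contents)
# Sourced by the init script, so anything set here replaces the defaults.
JMINMEM=6144M
JMAXMEM=6144M
LISTENPORT=8983
JVM_OPTS="-XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -XX:NewRatio=3"
```

One thing to watch: the script expands ${JMXPORT} into JMX_OPTS before it sources this file, so changing JMXPORT here has no effect on its own. To move the JMX port, override the whole JMX_OPTS string instead.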

For my Solr install on CentOS 6, I use /opt/solr4 as my installation path, and /index/solr4 as my solr home. The /index directory is a dedicated filesystem; /opt is part of the root filesystem.

From the example directory, I copied cloud-scripts, contexts, etc, lib, webapps, and start.jar over to /opt/solr4. My stuff was created before 4.3.0, so the resources directory didn't exist. I was already using log4j with a custom Solr build, and I put my log4j.properties file in etc instead. I created a logs directory and a run directory in /opt/solr4.
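
The copy described above could be sketched roughly as follows. The function name and the source path in the example call are assumptions; adjust both paths to your own layout.

```shell
# Copy the needed pieces of the Solr example directory into an install
# root, and create the logs and run directories the init script expects.
# Arguments: <example dir> <install root>
install_solr_root() {
  local example=$1 root=$2 item
  mkdir -p "$root/logs" "$root/run" || return 1
  for item in cloud-scripts contexts etc lib webapps start.jar; do
    cp -Rp "$example/$item" "$root/" || return 1
  done
}

# Example call (not run here; the source path is a placeholder):
#   install_solr_root /path/to/solr/example /opt/solr4
```

The target directory also needs to be writable by the user that runs Solr, since the init script writes the pid file and logs there.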

My directory structure in the solr home (/index/solr4) is more complex than most. What a new user really needs to know is that solr.xml goes here and dictates the rest of the structure. I have a symlink at /index/solr4/lib, pointing to /opt/solr4/solrlib, so that jars placed in ${solr.solr.home}/lib are actually located in the program directory, not the data directory. That makes for a much cleaner version control scenario, because both directories are git repositories cloned from our internal git server.
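
A minimal sketch of setting up that symlink, assuming the lib entry in the solr home does not already exist as a real directory. The function name is hypothetical.

```shell
# Point <solr home>/lib at a directory inside the program directory.
# Arguments: <solr home> <lib directory in the program dir>
link_shared_lib() {
  local solr_home=$1 program_lib=$2
  mkdir -p "$program_lib" || return 1
  # -n replaces an existing symlink instead of descending into its target.
  ln -sfn "$program_lib" "$solr_home/lib"
}

# Example call (not run here):
#   link_shared_lib /index/solr4 /opt/solr4/solrlib
```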

Unlike the example configs, my solrconfig.xml files do not have <lib> directives for loading jars. Solr automatically loads jars from ${solr.solr.home}/lib, so the jars living in that symlinked lib directory are picked up without any extra configuration. See SOLR-4852 for caveats regarding central lib directories.

If you want to run SolrCloud, you need to install ZooKeeper separately and put your zkHost parameter in solr.xml. Due to a bug, putting zkHost in solr.xml doesn't work properly until 4.4.0.



ShawnHeisey (last edited 2014-08-12 22:41:06 by ShawnHeisey)