| <?xml version="1.0" encoding="UTF-8"?> |
| <!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" |
| "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"> |
| <refentry id="re-drbdconf"> |
| <refentryinfo> |
| <date>6 May 2011</date> |
| |
| <productname>DRBD</productname> |
| |
| <productnumber>8.4.0</productnumber> |
| </refentryinfo> |
| |
| <refmeta> |
| <refentrytitle>drbd.conf</refentrytitle> |
| |
| <manvolnum>5</manvolnum> |
| |
| <refmiscinfo class="manual">Configuration Files</refmiscinfo> |
| </refmeta> |
| |
| <refnamediv> |
| <refname>drbd.conf</refname> |
| |
| <refpurpose>Configuration file for DRBD's devices <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| </indexterm></refpurpose> |
| </refnamediv> |
| |
| <refsect1> |
| <title>Introduction</title> |
| |
| <para>The file <option>/etc/drbd.conf</option> is read by <option>drbdadm</option>.</para> |
| |
| <para>The file format was designed so that a verbatim copy of the file can be kept on both |
| nodes of the cluster. Doing so is highly recommended, because it keeps your configuration |
| manageable. The file <option>/etc/drbd.conf</option> should be the same on both nodes of the |
| cluster. Changes to <option>/etc/drbd.conf</option> do not apply immediately; they take effect |
| only once they are applied to the running resources, for example with |
| <option>drbdadm adjust</option>.</para> |
| |
| <para>By convention, the main configuration file contains two include statements. The first |
| includes the file <option>/etc/drbd.d/global_common.conf</option>, the second all files with a |
| <option>.res</option> suffix.</para> |
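|
| <para>As a minimal sketch, such a main configuration file might consist of nothing but the |
| two include statements. The relative paths shown here are an assumption and may differ on |
| your installation:</para> |
|
| <programlisting format="linespecific"># /etc/drbd.conf (illustrative) |
| include "drbd.d/global_common.conf"; |
| include "drbd.d/*.res";</programlisting> |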
| |
| <para><example> |
| <title>A small example.res file</title> |
| |
| <programlisting format="linespecific">resource r0 { |
| net { |
| protocol C; |
| cram-hmac-alg sha1; |
| shared-secret "FooFunFactory"; |
| } |
| disk { |
| resync-rate 10M; |
| } |
| on alice { |
| volume 0 { |
| device minor 1; |
| disk /dev/sda7; |
| meta-disk internal; |
| } |
| address 10.1.1.31:7789; |
| } |
| on bob { |
| volume 0 { |
| device minor 1; |
| disk /dev/sda7; |
| meta-disk internal; |
| } |
| address 10.1.1.32:7789; |
| } |
| }</programlisting> |
| </example>In this example, there is a single DRBD resource (called r0) which uses protocol C |
| for the connection between its devices. It contains a single volume which, on host |
| <replaceable>alice</replaceable>, uses <replaceable>/dev/drbd1</replaceable> as the device for its |
| application and <replaceable>/dev/sda7</replaceable> as low-level storage for the data. The |
| IP addresses specify the network interfaces to be used. A resync process, should one |
| run, should use about 10MByte/second of IO bandwidth. This <option>resync-rate</option> statement is |
| valid for volume 0, but would also be valid for further volumes. In this example it assigns |
| the full 10MByte/second to each volume.</para> |
| |
| <para>There may be multiple resource sections in a single drbd.conf file. For more examples, |
| please have a look at the |
| <ulink url="http://www.drbd.org/users-guide/"><citetitle>DRBD User's Guide</citetitle></ulink>.</para> |
| |
| </refsect1> |
| |
| <refsect1> |
| <title>File Format</title> |
| |
| <para>The file consists of sections and parameters. A section begins with a keyword, sometimes |
| an additional name, and an opening brace (<quote>{</quote>). A section ends with a closing |
| brace (<quote>}</quote>). The braces enclose the parameters.</para> |
| |
| <para>section [name] { parameter value; [...] }</para> |
| |
| <para>A parameter starts with the identifier of the parameter followed by whitespace. Every |
| subsequent character is considered part of the parameter's value. Boolean parameters are a |
| special case: they consist only of the identifier. Parameters are terminated by a |
| semicolon (<quote>;</quote>).</para> |
| |
| <para>Some parameter values have default units which might be overruled by K, M or G. These |
| units are defined in the usual way (K = 2^10 = 1024, M = 1024 K, G = 1024 M).</para> |
| |
| <para>Comments may be placed into the configuration file and must begin with a hash sign |
| (<quote>#</quote>). Subsequent characters are ignored until the end of the line.</para> |
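|
| <para>The rules above can be sketched in one short fragment. The values are illustrative |
| assumptions, not recommendations:</para> |
|
| <programlisting format="linespecific"># A comment runs from the hash sign to the end of the line. |
| global { |
|     disable-ip-verification;    # Boolean parameter: identifier only |
| } |
| common { |
|     disk { |
|         resync-rate 10M;        # value with a unit suffix (M = 1024 K) |
|     } |
| }</programlisting> |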
| |
| <refsect2> |
| <title>Sections</title> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="skip"><option>skip</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>skip</secondary> |
| </indexterm> Comments out chunks of text, even spanning more than one line. |
| Characters between the keyword <option>skip</option> and the opening brace |
| (<quote>{</quote>) are ignored. Everything enclosed by the braces is skipped. This |
| comes in handy if you just want to comment out some '<option>resource [name] |
| {...}</option>' section: just precede it with '<option>skip</option>'.</para> |
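|
| <para>A minimal sketch, assuming a hypothetical resource r9 that should be disabled |
| temporarily without deleting its definition:</para> |
|
| <programlisting format="linespecific"># The whole block, including the nested sections, is ignored. |
| skip resource r9 { |
|     on alice { |
|         device minor 9; |
|         disk /dev/sdb1; |
|         meta-disk internal; |
|         address 10.1.1.31:7799; |
|     } |
|     on bob { |
|         device minor 9; |
|         disk /dev/sdb1; |
|         meta-disk internal; |
|         address 10.1.1.32:7799; |
|     } |
| }</programlisting> |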
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="global"><option>global</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>global</secondary> |
| </indexterm> Configures some global parameters. Currently only |
| <option>minor-count</option>, <option>dialog-refresh</option>, |
| <option>disable-ip-verification</option>, <option>udev-always-use-vnr</option> and |
| <option>usage-count</option> are allowed here. You may only have one global section, |
| preferably as the first section.</para> |
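|
| <para>A sketch of a global section; the values shown are illustrative assumptions:</para> |
|
| <programlisting format="linespecific">global { |
|     usage-count no;       # do not take part in the usage counter |
|     minor-count 64;       # sizing hint, see minor-count |
|     dialog-refresh 5;     # redraw the user dialog every 5 seconds |
| }</programlisting> |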
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="common"><option>common</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>common</secondary> |
| </indexterm> All resources inherit the options set in this section. The common |
| section might have a <option>startup</option>, an <option>options</option>, a |
| <option>handlers</option>, a <option>net</option> and a <option>disk</option> |
| section.</para> |
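|
| <para>A sketch of a common section whose settings every resource inherits. The chosen |
| parameters and values are assumptions for illustration only:</para> |
|
| <programlisting format="linespecific">common { |
|     startup { |
|         wfc-timeout 120; |
|     } |
|     net { |
|         protocol C; |
|     } |
|     disk { |
|         on-io-error detach; |
|     } |
| }</programlisting> |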
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="resource"><option>resource <replaceable>name</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>resource</secondary> |
| </indexterm> Configures a DRBD resource. Each resource section needs to have two (or |
| more) <option>on <replaceable>host</replaceable></option> sections and may have a |
| <option>startup</option>, an <option>options</option>, a <option>handlers</option>, a |
| <option>net</option> and a <option>disk</option> section. It might contain |
| <option>volume</option> sections.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="on"><option>on <replaceable>host-name</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>on</secondary> |
| </indexterm> Carries the necessary configuration parameters for a DRBD device of the |
| enclosing resource. <replaceable>host-name</replaceable> is mandatory and must match |
| the Linux host name (uname -n) of one of the nodes. You may list more than one host |
| name here, in case you want to use the same parameters on several hosts (you'd have to |
| move the IP around usually). Or you may list more than two such sections. |
| <programlisting format="linespecific"> resource r1 { |
| protocol C; |
| device minor 1; |
| meta-disk internal; |
| |
| on alice bob { |
| address 10.2.2.100:7801; |
| disk /dev/mapper/some-san; |
| } |
| on charlie { |
| address 10.2.2.101:7801; |
| disk /dev/mapper/other-san; |
| } |
| on daisy { |
| address 10.2.2.103:7801; |
| disk /dev/mapper/other-san-as-seen-from-daisy; |
| } |
| } |
| </programlisting>See also the <option>floating</option> section keyword. Required statements in |
| this section: <option>address</option> and <option>volume</option>. Note for backward |
| compatibility and convenience it is valid to embed the statements of a single volume |
| directly into the host section.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="volume"><option>volume <replaceable>vnr</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>volume</secondary> |
| </indexterm> Defines a volume within a connection. The minor numbers of a replicated |
| volume might be different on different hosts; the volume number |
| (<replaceable>vnr</replaceable>) is what groups them together. Required parameters in |
| this section: <option>device</option>, <option>disk</option>, |
| <option>meta-disk</option>.</para> |
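|
| <para>A sketch of a resource with two volumes, assuming hypothetical devices and |
| addresses. The minor numbers differ between the hosts; the volume numbers 0 and 1 pair up |
| the replicated devices:</para> |
|
| <programlisting format="linespecific">resource r0 { |
|     on alice { |
|         address 10.1.1.31:7789; |
|         volume 0 { device minor 1; disk /dev/sda7; meta-disk internal; } |
|         volume 1 { device minor 2; disk /dev/sda8; meta-disk internal; } |
|     } |
|     on bob { |
|         address 10.1.1.32:7789; |
|         volume 0 { device minor 5; disk /dev/sdb7; meta-disk internal; } |
|         volume 1 { device minor 6; disk /dev/sdb8; meta-disk internal; } |
|     } |
| }</programlisting> |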
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="stacked-on-top-of"><option>stacked-on-top-of <replaceable>resource</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>stacked-on-top-of</secondary> |
| </indexterm> For a stacked DRBD setup (3 or 4 nodes), a |
| <option>stacked-on-top-of</option> is used instead of an <option>on</option> section. |
| Required parameters in this section: <option>device</option> and |
| <option>address</option>.</para> |
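|
| <para>A sketch of a three-node stacked setup, assuming that a lower-level resource r0 |
| is already defined and that the host name, devices and addresses below are hypothetical:</para> |
|
| <programlisting format="linespecific">resource r0-U { |
|     net { |
|         protocol A; |
|     } |
|     # replaces the "on" sections of the two lower-level nodes |
|     stacked-on-top-of r0 { |
|         device minor 10; |
|         address 192.168.42.1:7788; |
|     } |
|     # the third node is described as usual |
|     on charlie { |
|         device minor 10; |
|         disk /dev/sdb1; |
|         meta-disk internal; |
|         address 192.168.42.2:7788; |
|     } |
| }</programlisting> |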
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="floating"><option>floating <replaceable>AF addr:port</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>floating</secondary> |
| </indexterm> Carries the necessary configuration parameters for a DRBD device of the |
| enclosing resource. This section is very similar to the <option>on</option> section. |
| It differs from the <option>on</option> section in that the host sections are matched |
| to machines by IP address instead of by node name. Required |
| parameters in this section: <option>device</option>, <option>disk</option>, |
| <option>meta-disk</option>, all of which |
| <emphasis>may</emphasis> be inherited from the resource section, in which case you may |
| shorten this section down to just the address identifier. <programlisting |
| format="linespecific"> resource r2 { |
| protocol C; |
| device minor 2; |
| disk /dev/sda7; |
| meta-disk internal; |
| |
| # short form, device, disk and meta-disk inherited |
| floating 10.1.1.31:7802; |
| |
| # longer form, only device inherited |
| floating 10.1.1.32:7802 { |
| disk /dev/sdb; |
| meta-disk /dev/sdc8; |
| } |
| } |
| </programlisting></para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="s-disk"><option>disk</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>disk</secondary> |
| </indexterm> This section is used to fine tune DRBD's properties with respect to the |
| low level storage. Please refer to <citerefentry> |
| <refentrytitle>drbdsetup</refentrytitle> |
|
| <manvolnum>8</manvolnum> |
| </citerefentry> for a detailed description of the parameters. Optional parameters: |
| <option>on-io-error</option>, <option>size</option>, <option>fencing</option>, |
| <option>disk-barrier</option>, <option>disk-flushes</option>, |
| <option>disk-drain</option>, <option>md-flushes</option>, |
| <option>max-bio-bvecs</option>, <option>resync-rate</option>, |
| <option>resync-after</option>, <option>al-extents</option>, <option>al-updates</option>, |
| <option>c-plan-ahead</option>, <option>c-fill-target</option>, |
| <option>c-delay-target</option>, <option>c-max-rate</option>, |
| <option>c-min-rate</option>, <option>disk-timeout</option>, |
| <option>discard-zeroes-if-aligned</option>, |
| <option>rs-discard-granularity</option>, |
| <option>read-balancing</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="net"><option>net</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>net</secondary> |
| </indexterm> This section is used to fine tune DRBD's properties. Please refer to |
| <citerefentry> |
| <refentrytitle>drbdsetup</refentrytitle> |
| |
| <manvolnum>8</manvolnum> |
| </citerefentry> for a detailed description of this section's parameters. Optional |
| parameters: <option>protocol</option>, <option>sndbuf-size</option>, |
| <option>rcvbuf-size</option>, <option>timeout</option>, <option>connect-int</option>, |
| <option>ping-int</option>, <option>ping-timeout</option>, |
| <option>max-buffers</option>, <option>max-epoch-size</option>, |
| <option>ko-count</option>, <option>allow-two-primaries</option>, |
| <option>cram-hmac-alg</option>, <option>shared-secret</option>, |
| <option>after-sb-0pri</option>, <option>after-sb-1pri</option>, |
| <option>after-sb-2pri</option>, <option>data-integrity-alg</option>, |
| <option>no-tcp-cork</option>, <option>on-congestion</option>, |
| <option>congestion-fill</option>, <option>congestion-extents</option>, |
| <option>verify-alg</option>, <option>use-rle</option>, |
| <option>csums-alg</option>, |
| <option>socket-check-timeout</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="startup"><option>startup</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>startup</secondary> |
| </indexterm> This section is used to fine tune DRBD's properties. Please refer to |
| <citerefentry> |
| <refentrytitle>drbdsetup</refentrytitle> |
| |
| <manvolnum>8</manvolnum> |
| </citerefentry> for a detailed description of this section's parameters. Optional |
| parameters: <option>wfc-timeout</option>, <option>degr-wfc-timeout</option>, |
| <option>outdated-wfc-timeout</option>, <option>wait-after-sb</option>, |
| <option>stacked-timeouts</option> and <option>become-primary-on</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="options"><option>options</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>options</secondary> |
| </indexterm> This section is used to fine tune the behaviour of the resource object. |
| Please refer to <citerefentry> |
| <refentrytitle>drbdsetup</refentrytitle> |
| |
| <manvolnum>8</manvolnum> |
| </citerefentry> for a detailed description of this section's parameters. Optional |
| parameters: <option>cpu-mask</option>, and |
| <option>on-no-data-accessible</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="handlers"><option>handlers</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>handlers</secondary> |
| </indexterm> In this section you can define handlers (executables) that are started |
| by the DRBD system in response to certain events. Optional parameters: |
| <option>pri-on-incon-degr</option>, <option>pri-lost-after-sb</option>, |
| <option>pri-lost</option>, <option>fence-peer</option> (formerly outdate-peer), |
| <option>local-io-error</option>, <option>initial-split-brain</option>, |
| <option>split-brain</option>, <option>before-resync-target</option>, |
| <option>after-resync-target</option>.</para> |
| |
| <para>The interface is done via environment variables:<itemizedlist> |
| <listitem> |
| <para><option>DRBD_RESOURCE</option> is the name of the resource</para> |
| </listitem> |
| |
| <listitem> |
| <para><option>DRBD_MINOR</option> is the minor number of the DRBD device, in |
| decimal.</para> |
| </listitem> |
| |
| <listitem> |
| <para><option>DRBD_CONF</option> is the path to the primary configuration file; |
| if you split your configuration into multiple files (e.g. in |
| <option>/etc/drbd.conf.d/</option>), this will not be helpful.</para> |
| </listitem> |
| |
| <listitem> |
| <para><option>DRBD_PEER_AF</option>, <option>DRBD_PEER_ADDRESS</option>, |
| <option>DRBD_PEERS</option> are the address family (e.g. <option>ipv6</option>), |
| the peer's address and hostnames.</para> |
| </listitem> |
| </itemizedlist> <option>DRBD_PEER</option> is deprecated.</para> |
| |
| <para>Please note that not all of these might be set for all handlers, and that some |
| values might not be useable for a <option>floating</option> definition.</para> |
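|
| <para>A sketch of a <option>handlers</option> section. The script paths and the |
| <quote>root</quote> argument are assumptions; substitute the executables installed on your |
| system. Each handler can read <option>DRBD_RESOURCE</option> and the other variables |
| listed above from its environment:</para> |
|
| <programlisting format="linespecific">handlers { |
|     # assumed script locations, adjust to your installation |
|     split-brain "/usr/lib/drbd/notify-split-brain.sh root"; |
|     local-io-error "/usr/lib/drbd/notify-io-error.sh root"; |
| }</programlisting> |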
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </refsect2> |
| |
| <refsect2> |
| <title>Parameters</title> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="minor-count"><option>minor-count <replaceable>count</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>minor-count</secondary> |
| </indexterm><replaceable>count</replaceable> may be a number from 1 to 1048575.</para> |
| |
| <para><replaceable>Minor-count</replaceable> is a sizing hint for DRBD. It helps to |
| right-size various memory pools. It should be set to the same order of |
| magnitude as the actual number of minors you use. By default the module loads with |
| 11 more resources than you currently have in your config, but at least 32.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="dialog-refresh"><option>dialog-refresh <replaceable>time</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>dialog-refresh</secondary> |
| </indexterm><replaceable>time</replaceable> may be 0 or a positive number.</para> |
| |
| <para>The user dialog redraws the second count every <replaceable>time</replaceable> |
| seconds (or does no redraws if <replaceable>time</replaceable> is 0). The default |
| value is 1.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="disable-ip-verification"><option>disable-ip-verification</option></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>disable-ip-verification</secondary> |
| </indexterm> |
| |
| <para>Use <replaceable>disable-ip-verification</replaceable> if, for some obscure |
| reason, drbdadm cannot use <option>ip</option> or <option>ifconfig</option> to |
| do a sanity check for the IP address. This option disables the IP |
| verification.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="udev-always-use-vnr"><option>udev-always-use-vnr</option></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>udev-always-use-vnr</secondary> |
| </indexterm> |
| |
| <para>When udev asks drbdadm for a list of device related symlinks, |
| drbdadm would suggest symlinks with differing naming conventions, |
| depending on whether the resource has explicit |
| <literal>volume VNR { }</literal> definitions, |
| or only one single volume with the implicit volume number 0: |
| <programlisting><![CDATA[ |
| # implicit single volume without "volume 0 {}" block |
| DEVICE=drbd<minor> |
| SYMLINK_BY_RES=drbd/by-res/<resource-name> |
| |
| # explicit volume definition: volume VNR { } |
| DEVICE=drbd<minor> |
| SYMLINK_BY_RES=drbd/by-res/<resource-name>/VNR |
| ]]></programlisting> |
| </para> |
| |
| <para>If you define this parameter in the global section, |
| drbdadm will always add the <literal>.../VNR</literal> part, |
| regardless of whether the volume definition was implicit or explicit. |
| </para> |
|
| <para>For legacy backward compatibility, this is off by default, |
| but we recommend enabling it.</para> |
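|
| <para>A sketch of how the parameter might be enabled. With this setting, a hypothetical |
| resource r0 that has only an implicit volume would be suggested the symlink |
| <literal>drbd/by-res/r0/0</literal>:</para> |
|
| <programlisting format="linespecific">global { |
|     udev-always-use-vnr;    # always append the volume number |
| }</programlisting> |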
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="usage-count"><option>usage-count <replaceable>val</replaceable></option></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>usage-count</secondary> |
| </indexterm> |
| |
| <para>Please participate in |
| <ulink url="http://usage.drbd.org"><citetitle>DRBD's online usage counter</citetitle></ulink>. |
| The most convenient way to do so is to set |
| this option to <option>yes</option>. Valid options are: <option>yes</option>, |
| <option>no</option> and <option>ask</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="protocol"><option>protocol <replaceable>prot-id</replaceable></option></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>protocol</secondary> |
| </indexterm> |
| |
| <para>On the TCP/IP link the specified <replaceable>protocol</replaceable> is used. |
| Valid protocol specifiers are A, B, and C.</para> |
| |
| <para>Protocol A: write IO is reported as completed once it has reached local disk and |
| local TCP send buffer.</para> |
|
| <para>Protocol B: write IO is reported as completed once it has reached local disk and |
| remote buffer cache.</para> |
|
| <para>Protocol C: write IO is reported as completed once it has reached both local and |
| remote disk.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="device"><option>device <replaceable>name</replaceable> minor |
| <replaceable>nr</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>device</secondary> |
| </indexterm> The name of the block device node of the resource being described. You |
| must use this device with your application (file system) and you must not use the low |
| level block device which is specified with the <option>disk</option> parameter.</para> |
| |
| <para>One can either omit the <replaceable>name</replaceable> or <option>minor</option> |
| and the <replaceable>minor number</replaceable>. If you omit the |
| <replaceable>name</replaceable>, a default of /dev/drbd<replaceable>minor</replaceable> |
| will be used.</para> |
| |
| <para>Udev will create additional symlinks in /dev/drbd/by-res and |
| /dev/drbd/by-disk.</para> |
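|
| <para>The forms described above might look as follows. This is a sketch; the device |
| name <literal>/dev/drbd_r0</literal> is a hypothetical example, and only one of the three |
| statements would appear in a given volume:</para> |
|
| <programlisting format="linespecific">device minor 1;                 # name omitted, defaults to /dev/drbd1 |
| device /dev/drbd1;              # "minor" and the minor number omitted |
| device /dev/drbd_r0 minor 1;    # explicit name and minor number</programlisting> |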
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="disk"><option>disk <replaceable>name</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>disk</secondary> |
| </indexterm> DRBD uses this block device to actually store and retrieve the data. |
| Never access such a device while DRBD is running on top of it. This also holds true |
| for <citerefentry> |
| <refentrytitle>dumpe2fs</refentrytitle> |
| |
| <manvolnum>8</manvolnum> |
| </citerefentry> and similar commands.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="address"><option>address <replaceable>AF addr:port</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>address</secondary> |
| </indexterm> A resource needs one <replaceable>IP</replaceable> address per device, |
| which is used to wait for incoming connections from the partner device and to |
| reach the partner device. <replaceable>AF</replaceable> must be one of |
| <option>ipv4</option>, <option>ipv6</option>, <option>ssocks</option> or |
| <option>sdp</option> (for compatibility reasons <option>sci</option> is an alias for |
| <option>ssocks</option>). It may be omitted for IPv4 addresses. The actual IPv6 address |
| that follows the <option>ipv6</option> keyword must be placed inside brackets: |
| <literal moreinfo="none">ipv6 [fd01:2345:6789:abcd::1]:7800</literal>.</para> |
| |
| <para>Each DRBD resource needs a TCP <replaceable>port</replaceable> which is used to |
| connect to the node's partner device. Two different DRBD resources may not use the |
| same <replaceable>addr:port</replaceable> combination on the same node.</para> |
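|
| <para>A sketch of the address forms described above, using illustrative addresses and |
| ports. Only one statement would appear in a given host section:</para> |
|
| <programlisting format="linespecific">address 10.1.1.31:7789;                        # address family omitted, IPv4 |
| address ipv4 10.1.1.31:7789;                   # explicit IPv4 |
| address ipv6 [fd01:2345:6789:abcd::1]:7800;    # IPv6 address in brackets</programlisting> |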
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="meta-disk"><option>meta-disk internal</option></term> |
| |
| <term><option>meta-disk <replaceable>device</replaceable></option></term> |
| |
| <term><option>meta-disk <replaceable>device</replaceable> [<replaceable>index</replaceable>]</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>meta-disk</secondary> |
| </indexterm> Internal means that the last part of the backing device is used to |
| store the meta-data. The size of the meta-data is computed based on the size of the |
| device.</para> |
| |
| <para>When a <replaceable>device</replaceable> is specified, either with or without an |
| <replaceable>index</replaceable>, DRBD stores the meta-data on this device. Without |
| <replaceable>index</replaceable>, the size of the meta-data is determined by the size |
| of the data device. This is usually used with LVM, which allows many variable-sized |
| block devices. The meta-data size is 36kB + Backing-Storage-size / 32k, rounded up |
| to the next 4kB boundary. (Rule of thumb: 32kByte per 1GByte of storage, rounded up |
| to the next MB.)</para> |
| |
| <para>When an <replaceable>index</replaceable> is specified, each index number refers to |
| a fixed slot of meta-data of 128 MB, which allows a maximum data size of 4 TiB. This way, |
| multiple DRBD devices can share the same meta-data device. For example, if /dev/sde6[0] |
| and /dev/sde6[1] are used, /dev/sde6 must be at least 256 MB in size. Because of the hard |
| size limit, use of meta-disk indexes is discouraged.</para> |
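|
| <para>A sketch of the three forms; the device names are illustrative, and only one |
| statement would appear in a given volume. As a worked example of the size formula, a |
| backing device of 1 GiB needs 36 kB + 1 GiB / 32k = 36 kB + 32 kB = 68 kB of meta-data, |
| which already lies on a 4 kB boundary:</para> |
|
| <programlisting format="linespecific">meta-disk internal;        # last part of the backing device |
| meta-disk /dev/sde5;       # external device, size follows the data device |
| meta-disk /dev/sde6[0];    # external device, fixed 128 MB slot number 0</programlisting> |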
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="on-io-error"><option>on-io-error <replaceable>handler</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>on-io-error</secondary> |
| </indexterm>The action <replaceable>handler</replaceable> is taken if the lower level |
| device reports io-errors to the upper layers.</para> |
|
| <para><replaceable>handler</replaceable> may be <option>pass_on</option>, |
| <option>call-local-io-error</option> or <option>detach</option>.</para> |
| |
| <para><option>pass_on</option>: The node downgrades the disk status to inconsistent, marks the |
| erroneous block as inconsistent in the bitmap and retries the IO on the remote node.</para> |
| |
| <para><option>call-local-io-error</option>: Call the handler script |
| <option>local-io-error</option>.</para> |
| |
| <para><option>detach</option>: The node drops its low level device, and continues in |
| diskless mode.</para> |
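|
| <para>A sketch of a <option>disk</option> section that selects the |
| <option>detach</option> policy; the choice of policy is an illustration, not a |
| recommendation:</para> |
|
| <programlisting format="linespecific">disk { |
|     on-io-error detach;    # drop the backing device, continue diskless |
| }</programlisting> |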
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="fencing"><option>fencing <replaceable>fencing_policy</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>fencing</secondary> |
| </indexterm> By <option>fencing</option> we understand preventive measures to avoid |
| situations where both nodes are primary and disconnected (AKA split brain).</para> |
| |
| <para>Valid fencing policies are:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="dont-care"><option>dont-care</option></term> |
| |
| <listitem> |
| <para>This is the default policy. No fencing actions are taken.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="resource-only"><option>resource-only</option></term> |
| |
| <listitem> |
| <para>If a node becomes a disconnected primary, it tries to fence the peer's |
| disk. This is done by calling the <option>fence-peer</option> handler. The |
| handler is supposed to reach the other node over alternative communication paths |
| and call '<option>drbdadm outdate res</option>' there.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="resource-and-stonith"><option>resource-and-stonith</option></term> |
| |
| <listitem> |
| <para>If a node becomes a disconnected primary, it freezes all its IO operations |
| and calls its fence-peer handler. The fence-peer handler is supposed to reach |
| the peer over alternative communication paths and call 'drbdadm outdate res' |
| there. In case it cannot reach the peer it should stonith the peer. IO is |
| resumed as soon as the situation is resolved. In case your handler fails, you |
| can resume IO with the <option>resume-io</option> command.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
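|
| <para>A sketch that combines a fencing policy with a <option>fence-peer</option> |
| handler. The resource name and the script path are assumptions; use the handler that |
| matches your cluster manager:</para> |
|
| <programlisting format="linespecific">resource r0 { |
|     disk { |
|         fencing resource-only; |
|     } |
|     handlers { |
|         # assumed script location, adjust to your installation |
|         fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; |
|     } |
| }</programlisting> |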
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="disk-barrier"><option>disk-barrier</option></term> |
| |
| <term xml:id="disk-flushes"><option>disk-flushes</option></term> |
| |
| <term xml:id="disk-drain"><option>disk-drain</option></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>disk-barrier</secondary> |
| </indexterm> |
| |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>disk-flushes</secondary> |
| </indexterm> |
| |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>disk-drain</secondary> |
| </indexterm> |
| |
| <para>DRBD has four implementations to express write-after-write dependencies to its |
| backing storage device. DRBD will use the first method that is supported by the |
| backing storage device and that is not disabled. By default the <emphasis>flush</emphasis> |
| method is used.</para> |
| |
| <para>Since drbd-8.4.2 <option>disk-barrier</option> is disabled by default |
| because since linux-2.6.36 (or 2.6.32 RHEL6) there is no reliable way to determine if queuing |
| of IO-barriers works. <emphasis>Dangerous</emphasis>: only enable it if you are |
| told to do so by someone who knows for sure.</para> |
| |
| <para>When selecting the method you should not only base your decision on the |
| measurable performance. In case your backing storage device has a volatile write cache |
| (plain disks, RAID of plain disks) you should use one of the first two. In case your |
| backing storage device has battery-backed write cache you may go with option 3. |
| Option 4 (disable everything, use "none") <emphasis>is dangerous</emphasis> |
| on most IO stacks, may result in write-reordering, and if so, |
| can theoretically be the reason for data corruption, or disturb |
| the DRBD protocol, causing spurious disconnect/reconnect cycles. |
| <emphasis>Do not use</emphasis> <option>no-disk-drain</option>.</para> |
| |
| <para>Unfortunately device mapper (LVM) might not support barriers.</para> |
| |
| <para>The letter after "wo:" in /proc/drbd indicates which method is currently in use |
| for a device: <option>b</option>, <option>f</option>, <option>d</option>, |
| <option>n</option>. The implementations are:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term>barrier</term> |
| |
| <listitem> |
| <para>The first requires that the driver of the backing storage device support |
| barriers (called 'tagged command queuing' in SCSI and 'native command queuing' |
| in SATA speak). The use of this method can be enabled by setting the |
| <option>disk-barrier</option> option to <option>yes</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term>flush</term> |
| |
| <listitem> |
| <para>The second requires that the backing device support disk flushes (called |
| 'force unit access' in drive vendors' speak). The use of this method can be |
| disabled by setting <option>disk-flushes</option> to <option>no</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term>drain</term> |
| |
| <listitem> |
| <para>The third method is simply to let write requests drain before write |
| requests of a new reordering domain are issued. This was the only implementation |
| before 8.0.9.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term>none</term> |
| |
| <listitem> |
| <para>The fourth method is to not express write-after-write dependencies to |
| the backing store at all, by also specifying <option>no-disk-drain</option>. |
| This <emphasis>is dangerous</emphasis> |
| on most IO stacks, may result in write-reordering, and if so, |
| can theoretically be the reason for data corruption, or disturb |
| the DRBD protocol, causing spurious disconnect/reconnect cycles. |
| <emphasis>Do not use</emphasis> <option>no-disk-drain</option>.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
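| |
| <para>As a minimal sketch of the recommendation above, a resource whose backing |
| device has a battery-backed write cache might switch off the first two methods so |
| that the drain method is used. The resource name <option>r0</option> is an assumption |
| borrowed from the introductory example, and the fragment omits the host sections.</para> |
| |
| <programlisting format="linespecific">resource r0 { |
| disk { |
| # Assumption: the controller has a battery-backed write cache. |
| disk-barrier no; |
| disk-flushes no; |
| # The drain method is not disabled, so "wo:" in /proc/drbd should show d. |
| } |
| }</programlisting> |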
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="md-flushes"><option>md-flushes</option></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>md-flushes</secondary> |
| </indexterm> |
| |
| <para>Disables the use of disk flushes and barrier BIOs when accessing the meta data |
| device. See the notes on <option>disk-flushes</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="max-bio-bvecs"><option>max-bio-bvecs</option></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>max-bio-bvecs</secondary> |
| </indexterm> |
| |
| <para>In some special circumstances the device mapper stack manages to pass BIOs to |
| DRBD that violate the constraints that are set forth by DRBD's merge_bvec() function |
| and which have more than one bvec. A known example is: phys-disk -> DRBD -> LVM |
| -> Xen -> misaligned partition (63) -> DomU FS. Then you might see "bio would |
| need to, but cannot, be split:" in the Dom0's kernel log.</para> |
| |
| <para>The best workaround is to properly align the partition within the VM (e.g. start |
| it at sector 1024). This costs 480 KiB of storage. Unfortunately the default of most |
| Linux partitioning tools is to start the first partition at an odd number (63). |
| Therefore most distributions' install helpers for virtual Linux machines will end up |
| with misaligned partitions. The second best workaround is to limit DRBD's max bvecs |
| per BIO (= <option>max-bio-bvecs</option>) to 1, but that might cost |
| performance.</para> |
| |
| <para>The default value of <option>max-bio-bvecs</option> is 0, which means that there |
| is no user imposed limitation.</para> |
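| |
| <para>A hedged fragment for the second workaround might look as follows. It assumes |
| that the option is placed in the <option>disk</option> section of the affected |
| resource.</para> |
| |
| <programlisting format="linespecific">disk { |
| # Limit every BIO to a single bvec; this may cost performance. |
| max-bio-bvecs 1; |
| }</programlisting> |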
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term> |
| <option>disk-timeout</option> |
| </term> |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>disk-timeout</secondary> |
| </indexterm> |
| <para>If the lower-level device on which a DRBD device stores its data does |
| not finish an I/O request within the defined |
| <option>disk-timeout</option>, DRBD treats this as a failure. The |
| lower-level device is detached, and the device's disk state advances to |
| Diskless. If DRBD is connected to one or more peers, the failed request |
| is passed on to one of them.</para> |
| |
| <para>This option is <emphasis>dangerous and may lead to kernel panic!</emphasis></para> |
| |
| <para>"Aborting" requests, or force-detaching the disk, is intended for |
| completely blocked/hung local backing devices which no longer |
| complete requests at all, not even with error completions. In this |
| situation, usually a hard-reset and failover is the only way out.</para> |
| |
| <para>By "aborting", basically faking a local error-completion, |
| we allow for a more graceful switchover by cleanly migrating services. |
| Still the affected node has to be rebooted "soon".</para> |
| <para>By completing these requests, we allow the upper layers to re-use |
| the associated data pages.</para> |
| |
| <para>If later the local backing device "recovers", and now DMAs some data |
| from disk into the original request pages, in the best case it will |
| just put random data into unused pages; but typically it will corrupt |
| meanwhile completely unrelated data, causing all sorts of damage.</para> |
| |
| <para>This means that a delayed successful completion, |
| especially for READ requests, is a reason to panic(). |
| We assume that a delayed <emphasis>error</emphasis> completion is OK, |
| though we will still complain noisily about it.</para> |
| <para>The default value of |
| <option>disk-timeout</option> is 0, which stands for an infinite timeout. |
| Timeouts are specified in units of 0.1 seconds. This option is available |
| since DRBD 8.3.12.</para> |
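| |
| <para>Because the unit is 0.1 seconds, a value of 100 corresponds to 10 seconds. The |
| following fragment only illustrates the syntax; the value is arbitrary, the placement |
| in the <option>disk</option> section is an assumption, and the warning above applies |
| in full.</para> |
| |
| <programlisting format="linespecific">disk { |
| # 100 * 0.1 s = 10 s. The default of 0 means no timeout at all. |
| disk-timeout 100; |
| }</programlisting> |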
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="discard-zeroes-if-aligned"><option>discard-zeroes-if-aligned <group choice="req" rep="norepeat"> |
| <arg choice="plain" rep="norepeat">yes</arg> |
| <arg choice="plain" rep="norepeat">no</arg> |
| </group></option></term> |
| <listitem> |
| <para> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>discard-zeroes-if-aligned</secondary> |
| </indexterm> |
| There are several aspects to discard/trim/unmap support on Linux |
| block devices. Even if discard is supported in general, it may fail |
| silently, or may partially ignore discard requests. Devices also |
| announce whether reading from unmapped blocks returns defined data |
| (usually zeroes), or undefined data (possibly old data, possibly |
| garbage). |
| </para><para> |
| If on different nodes, DRBD is backed by devices with differing discard |
| characteristics, discards may lead to data divergence (old data or |
| garbage left over on one backend, zeroes due to unmapped areas on the |
| other backend). Online verify would now potentially report tons of |
| spurious differences. While probably harmless for most use cases |
| (fstrim on a file system), DRBD cannot have that. |
| </para><para> |
| To play safe, we have to disable discard support, if our local backend |
| (on a Primary) does not support "discard_zeroes_data=true". We also have to |
| translate discards to explicit zero-out on the receiving side, unless |
| the receiving side (Secondary) supports "discard_zeroes_data=true", |
| thereby allocating areas that were supposed to be unmapped. |
| </para><para> |
| There are some devices (notably the LVM/DM thin provisioning) that are |
| capable of discard, but announce discard_zeroes_data=false. In the case of |
| DM-thin, discards aligned to the chunk size will be unmapped, and |
| reading from unmapped sectors will return zeroes. However, unaligned |
| partial head or tail areas of discard requests will be silently ignored. |
| </para><para> |
| If we now add a helper to explicitly zero-out these unaligned partial |
| areas, while passing on the discard of the aligned full chunks, we |
| effectively achieve discard_zeroes_data=true on such devices. |
| </para><para> |
| Setting <option>discard-zeroes-if-aligned</option> to <option>yes</option> |
| will allow DRBD to use discards, and to announce discard_zeroes_data=true, |
| even on backends that announce discard_zeroes_data=false. |
| </para><para> |
| Setting <option>discard-zeroes-if-aligned</option> to <option>no</option> |
| will cause DRBD to always fall back to zero-out on the receiving side, |
| and to not even announce discard capabilities on the Primary, |
| if the respective backend announces discard_zeroes_data=false. |
| </para><para> |
| We used to ignore the discard_zeroes_data setting completely. To not |
| break established and expected behaviour, and suddenly cause fstrim on |
| thin-provisioned LVs to run out-of-space instead of freeing up space, |
| the default value is <option>yes</option>. |
| </para><para> |
| This option is available since 8.4.7. |
| </para> |
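| <para> |
| As a sketch, an administrator who cannot rely on the backend zeroing aligned |
| discards could opt out of the default like this. The placement in the |
| <option>disk</option> section is an assumption. |
| </para> |
| <programlisting format="linespecific">disk { |
| # Fall back to explicit zero-out when the backend announces |
| # discard_zeroes_data=false. |
| discard-zeroes-if-aligned no; |
| }</programlisting> |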
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term> |
| <option>read-balancing <replaceable>method</replaceable></option> |
| </term> |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>read-balancing</secondary> |
| </indexterm> |
| <para> |
| The supported <replaceable>methods</replaceable> for load balancing of |
| read requests are <option>prefer-local</option>, <option>prefer-remote</option>, |
| <option>round-robin</option>, <option>least-pending</option>, |
| <option>when-congested-remote</option>, <option>32K-striping</option>, |
| <option>64K-striping</option>, <option>128K-striping</option>, |
| <option>256K-striping</option>, <option>512K-striping</option> |
| and <option>1M-striping</option>.</para> |
| <para> The default value is <option>prefer-local</option>. |
| This option is available since 8.4.1. |
| </para> |
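| <para> |
| A minimal fragment selecting one of the methods listed above might look as |
| follows. The choice of <option>least-pending</option> is arbitrary, and the |
| placement in the <option>disk</option> section is an assumption. |
| </para> |
| <programlisting format="linespecific">disk { |
| # Any of the methods listed above may be used here. |
| read-balancing least-pending; |
| }</programlisting> |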
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="rs-discard-granularity"> |
| <option>rs-discard-granularity <replaceable>byte</replaceable></option> |
| </term> |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>rs-discard-granularity</secondary> |
| </indexterm> |
| <para> |
| When <option>rs-discard-granularity</option> is set to a nonzero, positive |
| value, DRBD tries to do a resync operation in requests of this size. |
| In case such a block contains only zero bytes on the sync source node, |
| the sync target node will issue a discard/trim/unmap command for |
| the area.</para> |
| <para>The value is constrained by the discard granularity of the backing |
| block device. In case <option>rs-discard-granularity</option> is not a |
| multiple of the discard granularity of the backing block device, DRBD |
| rounds it up. The feature only becomes active if the backing block device |
| reads back zeroes after a discard command.</para> |
| <para> The default value is 0. This option is available since 8.4.7. |
| </para> |
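| <para> |
| As an illustration, the fragment below requests resync in 64 KiB units |
| (65536 bytes). The value is an arbitrary example and the placement in the |
| <option>disk</option> section is an assumption; DRBD may round the value up as |
| described above. |
| </para> |
| <programlisting format="linespecific">disk { |
| # 65536 bytes = 64 KiB. The default of 0 disables the feature. |
| rs-discard-granularity 65536; |
| }</programlisting> |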
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="sndbuf-size"><option>sndbuf-size <replaceable>size</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>sndbuf-size</secondary> |
| </indexterm><replaceable>size</replaceable> is the size of the TCP socket send |
| buffer. The default value is 0, which (since 8.0.13 and 8.2.7, respectively) means |
| that the kernel autotunes the buffer size. You can specify smaller or larger |
| values. Larger values are appropriate for reasonable write throughput with protocol A |
| over high latency networks. Values below 32K do not make sense.</para> |
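| |
| <para>A hedged sketch for a protocol A link over a high latency network follows. |
| The size of 1M is an arbitrary example, not a recommendation.</para> |
| |
| <programlisting format="linespecific">net { |
| protocol A; |
| # A larger send buffer than autotuning might choose; 0 restores autotuning. |
| sndbuf-size 1M; |
| }</programlisting> |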
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="rcvbuf-size"><option>rcvbuf-size <replaceable>size</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>rcvbuf-size</secondary> |
| </indexterm><replaceable>size</replaceable> is the size of the TCP socket receive |
| buffer. The default value is 0, which means that the kernel autotunes the buffer |
| size. You can specify smaller or larger values, but usually this should be left at |
| its default.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="timeout"><option>timeout <replaceable>time</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>timeout</secondary> |
| </indexterm> If the partner node fails to send an expected response packet within |
| <replaceable>time</replaceable> tenths of a second, the partner node is considered |
| dead and therefore the TCP/IP connection is abandoned. This must be lower than |
| <option>connect-int</option> and <option>ping-int</option>. The |
| default value is 60 (= 6 seconds); the unit is 0.1 seconds.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="connect-int"><option>connect-int <replaceable>time</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>connect-int</secondary> |
| </indexterm> In case it is not possible to connect to the remote DRBD device |
| immediately, DRBD keeps on trying to connect. With this option you can set the time |
| between two retries. The default value is 10 seconds, the unit is 1 second.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="ping-int"><option>ping-int <replaceable>time</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>ping-int</secondary> |
| </indexterm> If the TCP/IP connection linking a DRBD device pair is idle for more |
| than <replaceable>time</replaceable> seconds, DRBD will generate a keep-alive packet |
| to check if its partner is still alive. The default is 10 seconds, the unit is 1 |
| second.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="ping-timeout"><option>ping-timeout <replaceable>time</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>ping-timeout</secondary> |
| </indexterm> The time the peer has to answer a keep-alive packet. In case |
| the peer's reply is not received within this time period, the peer is considered dead. |
| The default value is 5 (= 500 ms); the unit is tenths of a second.</para> |
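| |
| <para>Written out explicitly, the documented defaults of the four timing options |
| look like the sketch below. Note the mixed units, and that <option>timeout</option> |
| (6 seconds) stays below both <option>connect-int</option> and |
| <option>ping-int</option> (10 seconds each).</para> |
| |
| <programlisting format="linespecific">net { |
| # 60 * 0.1 s = 6 s |
| timeout 60; |
| # 10 s between connection attempts |
| connect-int 10; |
| # keep-alive after 10 s of idle time |
| ping-int 10; |
| # 5 * 0.1 s = 500 ms |
| ping-timeout 5; |
| }</programlisting> |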
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="max-buffers"><option>max-buffers <replaceable>number</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>max-buffers</secondary> |
| </indexterm> |
| Limits the memory usage per DRBD minor device on the receiving side, |
| or for internal buffers during resync or online-verify. |
| Unit is PAGE_SIZE, which is 4 KiB on most systems. |
| The minimum possible setting is hard coded to 32 (=128 KiB). |
| These buffers are used to hold data blocks while they are written to/read from disk. |
| To avoid possible distributed deadlocks on congestion, this setting is used |
| as a throttle threshold rather than a hard limit. Once more than max-buffers |
| pages are in use, further allocation from this pool is throttled. |
| You want to increase max-buffers if you cannot saturate the IO backend on the |
| receiving side. |
| </para> |
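| <para>As a worked example, assuming a PAGE_SIZE of 4 KiB, a setting of 8192 |
| corresponds to a throttle threshold of 8192 * 4 KiB = 32 MiB per minor device. The |
| value is illustrative only.</para> |
| <programlisting format="linespecific">net { |
| # 8192 pages * 4 KiB = 32 MiB (assuming 4 KiB pages) |
| max-buffers 8192; |
| }</programlisting> |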
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="ko-count"><option>ko-count <replaceable>number</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>ko-count</secondary> |
| </indexterm> In case the secondary node fails to complete a single write request for |
| <replaceable>number</replaceable> times the <option>timeout</option>, it is |
| expelled from the cluster (i.e. the primary node will kill and restart the connection). |
| To disable this feature, you should explicitly set it to 0; defaults may change between versions. |
| </para> |
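| <para>The effective limit is the product of the two settings. In the sketch below, |
| whose values are assumptions chosen for illustration, a write request that the |
| secondary has not completed after 7 * 6 s = 42 s causes the connection to be |
| dropped.</para> |
| <programlisting format="linespecific">net { |
| # 60 * 0.1 s = 6 s |
| timeout 60; |
| # 7 * 6 s = 42 s until the secondary is expelled; 0 disables the feature. |
| ko-count 7; |
| }</programlisting> |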
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="max-epoch-size"><option>max-epoch-size <replaceable>number</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>max-epoch-size</secondary> |
| </indexterm> The highest number of data blocks between two write barriers. If you |
| set this smaller than 10, you might decrease your performance.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="allow-two-primaries"><option>allow-two-primaries</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>allow-two-primaries</secondary> |
| </indexterm> With this option set you may assign the primary role to both nodes. You |
| should only use this option if you use a shared storage file system on top of DRBD. At |
| the time of writing the only ones are OCFS2 and GFS. If you use this option with any |
| other file system, you are going to crash your nodes and corrupt your data!</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="unplug-watermark"><option>unplug-watermark <replaceable>number</replaceable></option></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>unplug-watermark</secondary> |
| </indexterm> |
| <para> |
| This setting has no effect with recent kernels that use explicit on-stack |
| plugging (upstream Linux kernel 2.6.39, distributions may have backported). |
| </para> |
| <para>When the number of pending write requests on the standby (secondary) node |
| exceeds the <option>unplug-watermark</option>, we trigger the request processing of |
| our backing storage device. Some storage controllers deliver better performance with |
| small values, others deliver best performance when the value is set to the same value |
| as max-buffers, yet others don't feel much effect at all. |
| Minimum 16, default 128, maximum 131072.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="cram-hmac-alg"><option>cram-hmac-alg</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>cram-hmac-alg</secondary> |
| </indexterm> You need to specify the HMAC algorithm to enable peer authentication at |
| all. You are strongly encouraged to use peer authentication. The HMAC algorithm will |
| be used for the challenge response authentication of the peer. You may specify any |
| digest algorithm that is named in <option>/proc/crypto</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="shared-secret"><option>shared-secret</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>shared-secret</secondary> |
| </indexterm> The shared secret used in peer authentication. May be up to 64 |
| characters. Note that peer authentication is disabled as long as no |
| <option>cram-hmac-alg</option> (see above) is specified.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="after-sb-0pri"><option>after-sb-0pri </option> <replaceable>policy</replaceable></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>after-sb-0pri</secondary> |
| </indexterm> |
| |
| <para>Possible policies are:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="disconnect"><option>disconnect</option></term> |
| |
| <listitem> |
| <para>No automatic resynchronization, simply disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="discard-younger-primary"><option>discard-younger-primary</option></term> |
| |
| <listitem> |
| <para>Auto sync from the node that was primary before the split-brain situation |
| happened.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="discard-older-primary"><option>discard-older-primary</option></term> |
| |
| <listitem> |
| <para>Auto sync from the node that became primary second during the |
| split-brain situation.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="discard-zero-changes"><option>discard-zero-changes</option></term> |
| |
| <listitem> |
| <para>In case one node did not write anything since the split brain became |
| evident, sync from the node that wrote something to the node that did not write |
| anything. In case none wrote anything this policy uses a random decision to |
| perform a "resync" of 0 blocks. In case both have written something this policy |
| disconnects the nodes.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="discard-least-changes"><option>discard-least-changes</option></term> |
| |
| <listitem> |
| <para>Auto sync from the node that touched more blocks during the split brain |
| situation.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="discard-node-NODENAME"><option>discard-node-NODENAME</option></term> |
| |
| <listitem> |
| <para>Auto sync to the named node.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="after-sb-1pri"><option>after-sb-1pri </option> <replaceable>policy</replaceable></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>after-sb-1pri</secondary> |
| </indexterm> |
| |
| <para>Possible policies are:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="sb1-disconnect"><option>disconnect</option></term> |
| |
| <listitem> |
| <para>No automatic resynchronization, simply disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="consensus"><option>consensus</option></term> |
| |
| <listitem> |
| <para>Discard the version of the secondary if the outcome of the |
| <option>after-sb-0pri</option> algorithm would also destroy the current |
| secondary's data. Otherwise disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="sb1-violently-as0p"><option>violently-as0p</option></term> |
| |
| <listitem> |
| <para>Always take the decision of the <option>after-sb-0pri</option> algorithm, |
| even if that causes an erratic change of the primary's view of the data. This is |
| only useful if you use a one-node FS (i.e. not OCFS2 or GFS) with the |
| <option>allow-two-primaries</option> flag, <emphasis>AND</emphasis> if you |
| really know what you are doing. This is <emphasis>DANGEROUS and MAY CRASH YOUR |
| MACHINE</emphasis> if you have an FS mounted on the primary node.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="discard-secondary"><option>discard-secondary</option></term> |
| |
| <listitem> |
| <para>Discard the secondary's version.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="sb1-call-pri-lost-after-sb"><option>call-pri-lost-after-sb</option></term> |
| |
| <listitem> |
| <para>Always honor the outcome of the <option>after-sb-0pri</option> algorithm. |
| In case it decides the current secondary has the right data, it calls the |
| "pri-lost-after-sb" handler on the current primary.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="after-sb-2pri"><option>after-sb-2pri </option> <replaceable>policy</replaceable></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>after-sb-2pri</secondary> |
| </indexterm> |
| |
| <para>Possible policies are:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="sb2-disconnect"><option>disconnect</option></term> |
| |
| <listitem> |
| <para>No automatic resynchronization, simply disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="sb2-violently-as0p"><option>violently-as0p</option></term> |
| |
| <listitem> |
| <para>Always take the decision of the <option>after-sb-0pri</option> algorithm, |
| even if that causes an erratic change of the primary's view of the data. This is |
| only useful if you use a one-node FS (i.e. not OCFS2 or GFS) with the |
| <option>allow-two-primaries</option> flag, <emphasis>AND</emphasis> if you |
| really know what you are doing. This is <emphasis>DANGEROUS and MAY CRASH YOUR |
| MACHINE</emphasis> if you have an FS mounted on the primary node.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="sb2-call-pri-lost-after-sb"><option>call-pri-lost-after-sb</option></term> |
| |
| <listitem> |
| <para>Call the "pri-lost-after-sb" helper program on one of the machines. This |
| program is expected to reboot the machine, i.e. make it secondary.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
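| |
| <para>The three <option>after-sb-*</option> options are usually set together. The |
| following sketch shows one possible combination; it illustrates the syntax and is not |
| a recommendation for any particular cluster.</para> |
| |
| <programlisting format="linespecific">net { |
| # No primary: resync only if one node wrote nothing. |
| after-sb-0pri discard-zero-changes; |
| # One primary: discard the secondary's version. |
| after-sb-1pri discard-secondary; |
| # Two primaries: no automatic resynchronization. |
| after-sb-2pri disconnect; |
| }</programlisting> |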
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="always-asbp"><option>always-asbp</option></term> |
| |
| <listitem> |
| <para>Normally the automatic after-split-brain policies are only used if current |
| states of the UUIDs do not indicate the presence of a third node.</para> |
| |
| <para>With this option you request that the automatic after-split-brain policies are |
| used as long as the data sets of the nodes are somehow related. This might cause a |
| full sync, if the UUIDs indicate the presence of a third node. (Or double faults led |
| to strange UUID sets.)</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="rr-conflict"><option>rr-conflict </option> <replaceable>policy</replaceable></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>rr-conflict</secondary> |
| </indexterm> |
| |
| <para>This option helps to solve the cases when the outcome of the resync decision is |
| incompatible with the current role assignment in the cluster.</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="rr-disconnect"><option>disconnect</option></term> |
| |
| <listitem> |
| <para>No automatic resynchronization, simply disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="violently"><option>violently</option></term> |
| |
| <listitem> |
| <para>Sync to the primary node is allowed, violating the assumption that data on |
| a block device are stable for one of the nodes. <emphasis>Dangerous, do not |
| use.</emphasis></para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="call-pri-lost"><option>call-pri-lost</option></term> |
| |
| <listitem> |
| <para>Call the <option>pri-lost-after-sb</option> helper |
| program on one of the machines unless that machine can |
| demote to secondary. The helper program is expected to |
| reboot the machine, which brings the node into a secondary |
| role. Which machine runs the helper program is determined |
| by the <option>after-sb-0pri</option> strategy.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="data-integrity-alg"><option>data-integrity-alg </option> <replaceable>alg</replaceable></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>data-integrity-alg</secondary> |
| </indexterm> |
| |
| <para>DRBD can ensure the data integrity of the user's data on the network by |
| comparing hash values. Normally this is ensured by the 16-bit checksums in the headers |
| of TCP/IP packets.</para> |
| |
| <para>This option can be set to any of the kernel's data digest algorithms. In a |
| typical kernel configuration you should have at least one of <option>md5</option>, |
| <option>sha1</option>, and <option>crc32c</option> available. By default this is not |
| enabled.</para> |
| |
| <para>See also the notes on data integrity.</para> |
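| |
| <para>A minimal fragment enabling the check with one of the algorithms named above |
| follows. Whether <option>crc32c</option> is available depends on the kernel |
| configuration.</para> |
| |
| <programlisting format="linespecific">net { |
| # Assumption: the running kernel provides the crc32c digest. |
| data-integrity-alg crc32c; |
| }</programlisting> |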
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="tcp-cork"><option>tcp-cork</option></term> |
| |
| <listitem> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>tcp-cork</secondary> |
| </indexterm> |
| |
| <para>DRBD usually uses the TCP socket option TCP_CORK to hint to the network stack |
| when it can expect more data, and when it should flush out what it has in its send |
| queue. It turned out that there is at least one network stack that performs worse when |
| one uses this hinting method. Therefore we introduced this option. By setting |
| <option>tcp-cork</option> to <option>no</option> you can disable the setting and |
| clearing of the TCP_CORK socket option by DRBD.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="on-congestion"><option>on-congestion <replaceable>congestion_policy</replaceable></option></term> |
| |
| <term xml:id="congestion-fill"><option>congestion-fill <replaceable>fill_threshold</replaceable></option></term> |
| |
| <term xml:id="congestion-extents"><option>congestion-extents |
| <replaceable>active_extents_threshold</replaceable></option></term> |
| |
| <listitem> |
| <para>By default DRBD blocks when the available TCP send queue becomes full. That |
| means it will slow down the application that generates the write requests that cause |
| DRBD to send more data down that TCP connection.</para> |
| |
| <para>When DRBD is deployed with DRBD-proxy it might be more desirable that DRBD goes |
| into AHEAD/BEHIND mode shortly before the send queue becomes full. In AHEAD/BEHIND |
| mode DRBD no longer replicates data, but still keeps the connection open.</para> |
| |
| <para>The advantage of the AHEAD/BEHIND mode is that the application is not slowed |
| down, even if DRBD-proxy's buffer is not sufficient to buffer all write requests. The |
| downside is that the peer node falls behind, and that a resync will be necessary to |
| bring it back into sync. During that resync the peer node will have an inconsistent |
| disk.</para> |
| |
| <para>Available <replaceable>congestion_policy</replaceable>s are |
| <option>block</option> and <option>pull-ahead</option>. The default is |
| <option>block</option>. <replaceable>fill_threshold</replaceable> may be in the |
| range of 0 to 10 GiB. The default is 0, which disables the check. |
| <replaceable>active_extents_threshold</replaceable> has the same limits as |
| <option>al-extents</option>.</para> |
| |
| <para>The AHEAD/BEHIND mode and its settings are available since DRBD 8.3.10.</para> |
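| |
| <para>A hedged sketch of a DRBD-proxy setup that pulls ahead instead of blocking |
| follows. Both thresholds are arbitrary example values that would need to be sized |
| against the proxy buffer and <option>al-extents</option> of the actual |
| deployment.</para> |
| |
| <programlisting format="linespecific">net { |
| on-congestion pull-ahead; |
| # Go into AHEAD/BEHIND mode once about 2 GiB are in flight ... |
| congestion-fill 2G; |
| # ... or once this many activity log extents are in use. |
| congestion-extents 2000; |
| }</programlisting> |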
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="wfc-timeout"><option>wfc-timeout <replaceable>time</replaceable></option></term> |
| |
| <listitem> |
| <para>Wait for connection timeout. <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>wfc-timeout</secondary> |
| </indexterm> The init script <citerefentry> |
| <refentrytitle>drbd</refentrytitle> |
| |
| <manvolnum>8</manvolnum> |
| </citerefentry> blocks the boot process until the DRBD resources are connected. When |
| the cluster manager starts later, it does not see a resource with internal |
| split-brain. In case you want to limit the wait time, do it here. Default is 0, which |
| means unlimited. The unit is seconds.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="degr-wfc-timeout"><option>degr-wfc-timeout <replaceable>time</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>degr-wfc-timeout</secondary> |
| </indexterm> Wait for connection timeout, if this node was part of a degraded cluster. In |
| case a degraded cluster (= cluster with only one node left) is rebooted, this timeout |
| value is used instead of wfc-timeout, because the peer is less likely to show up in |
| time, if it had been dead before. Value 0 means unlimited.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="outdated-wfc-timeout"><option>outdated-wfc-timeout <replaceable>time</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>outdated-wfc-timeout</secondary> |
| </indexterm> Wait for connection timeout, if the peer was outdated. In case a |
| degraded cluster (= cluster with only one node left) with an outdated peer disk is |
| rebooted, this timeout value is used instead of wfc-timeout, because the peer is not |
| allowed to become primary in the meantime. Value 0 means unlimited.</para> |
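| |
| <para>The three wait-for-connection timeouts can be combined as in the sketch below. |
| The values are assumptions for illustration, and the fragment assumes that all three |
| options use seconds as their unit and belong in the <option>startup</option> |
| section.</para> |
| |
| <programlisting format="linespecific">startup { |
| # Normal boot: wait up to 120 s for the peer. |
| wfc-timeout 120; |
| # Degraded cluster: the peer is less likely to show up. |
| degr-wfc-timeout 60; |
| # Peer disk was outdated: it cannot have become primary meanwhile. |
| outdated-wfc-timeout 30; |
| }</programlisting> |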
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="wait-after-sb"><option>wait-after-sb</option></term> |
| |
| <listitem> |
| <para>Setting this option makes the init script continue to wait even if the device |
| pair had a split brain situation and therefore refuses to connect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="become-primary-on"><option>become-primary-on <replaceable>node-name</replaceable></option></term> |
| |
| <listitem> |
| <para>Sets on which node the device should be promoted to primary role by the init |
| script. The <replaceable>node-name</replaceable> might either be a host name or the |
| keyword <option>both</option>. When this option is not set the devices stay in |
| secondary role on both nodes. Usually one delegates the role assignment to a cluster |
| manager (e.g. heartbeat).</para> |
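| |
| <para>A minimal sketch, assuming the option is given in a <option>startup</option> |
| section and reusing the host name <option>alice</option> from the example resource at |
| the top of this page:</para> |
| |
| <programlisting format="linespecific">startup { |
| # the init script promotes the devices on alice; bob stays secondary |
| become-primary-on alice; |
| }</programlisting> |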
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="stacked-timeouts"><option>stacked-timeouts</option></term> |
| |
| <listitem> |
| <para>Usually <option>wfc-timeout</option> and <option>degr-wfc-timeout</option> are |
| ignored for stacked devices; instead, twice the value of <option>connect-int</option> |
| is used for the connection timeouts. The <option>stacked-timeouts</option> keyword |
| disables this behavior and forces DRBD to honor the <option>wfc-timeout</option> and |
| <option>degr-wfc-timeout</option> statements. Only do that if the peer of the stacked |
| resource is usually not available or will usually not become primary. Using this |
| option incorrectly risks causing an unexpected split brain.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="resync-rate"><option>resync-rate <replaceable>rate</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>resync-rate</secondary> |
| </indexterm> To ensure smooth operation of the application on top of DRBD, it is |
| possible to limit the bandwidth that background synchronization may use. The |
| default is 250 KB/sec; the default unit is KB/sec. The optional suffixes K, M, and G |
| are allowed.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="use-rle"><option>use-rle</option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>use-rle</secondary> |
| </indexterm> During resync-handshake, the dirty-bitmaps of the nodes are exchanged |
| and merged (using bit-or), so the nodes will have the same understanding of which |
| blocks are dirty. On large devices, the fine grained dirty-bitmap can become large as |
| well, and the bitmap exchange can take quite some time on low-bandwidth links.</para> |
| |
| <para>Because the bitmap typically contains compact areas where all bits are unset |
| (clean) or set (dirty), a simple run-length encoding scheme can considerably reduce |
| the network traffic necessary for the bitmap exchange.</para> |
| |
| <para>For backward compatibility reasons, and because on fast links this may not |
| improve transfer time but does consume CPU cycles, this defaults to off.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="socket-check-timeout"><option>socket-check-timeout <replaceable>value</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>socket-check-timeout</secondary> |
| </indexterm> In setups involving a DRBD-proxy and connections that experience a lot of |
| buffer-bloat, it might be necessary to set <option>ping-timeout</option> to an |
| unusually high value. By default, DRBD uses the same value when waiting to see whether |
| a newly established TCP connection is stable. Since the DRBD-proxy is usually located |
| in the same data center, such a long wait time may hinder DRBD's connect process. |
| </para> |
| <para>In such setups <option>socket-check-timeout</option> should be set to |
| at least the round trip time between DRBD and DRBD-proxy, which in most |
| cases means a value of 1.</para> |
| <para> |
| The default unit is tenths of a second; the default value is 0 (which causes |
| DRBD to use the value of <option>ping-timeout</option> instead). |
| Introduced in 8.4.5.</para> |
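| |
| <para>A sketch of such a proxy setup, assuming both options are given in the |
| <option>net</option> section. The <option>ping-timeout</option> value is a made-up |
| example for a link with heavy buffer-bloat:</para> |
| |
| <programlisting format="linespecific">net { |
| # 3 seconds (unit: tenths of a second) for the long-distance link |
| ping-timeout 30; |
| # but consider a new socket stable after 0.1 seconds, |
| # roughly the round trip time to the local DRBD-proxy |
| socket-check-timeout 1; |
| }</programlisting> |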
| |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="resync-after"><option>resync-after <replaceable>res-name</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>resync-after</secondary> |
| </indexterm> By default, the resynchronization of all devices runs in parallel. By |
| defining a resync-after dependency, the resynchronization of this resource will start |
| only if the resource <replaceable>res-name</replaceable> is already in connected state |
| (i.e., has finished its resynchronization).</para> |
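| |
| <para>For illustration, assume two resources named <option>r0</option> and |
| <option>r1</option> (the second name is hypothetical) that share the same physical |
| disks. The following sketch, with the option placed in the <option>disk</option> |
| section, serializes their resynchronization:</para> |
| |
| <programlisting format="linespecific">resource r1 { |
| disk { |
| # r1 starts to resync only after r0 has finished |
| resync-after r0; |
| } |
| # net and host sections of r1 omitted |
| }</programlisting> |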
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="al-extents"><option>al-extents <replaceable>extents</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>al-extents</secondary> |
| </indexterm> DRBD automatically performs hot area detection. With this parameter you |
| control how big the hot area (= active set) can get. Each extent marks 4M of the |
| backing storage (= low-level device). If a primary node leaves the cluster |
| unexpectedly, the areas covered by the active set must be resynced when the failed |
| node rejoins. The data structure is stored in the meta-data area, therefore each |
| change of the active set is a write operation to the meta-data device. A higher number |
| of extents gives longer resync times but fewer updates to the meta-data. The default |
| number of <replaceable>extents</replaceable> is 1237. (Minimum: 7, Maximum: |
| 65534)</para> |
| <para> |
| Note that the effective maximum may be smaller, depending on how |
| you created the device meta data, see also |
| <citerefentry><refentrytitle>drbdmeta</refentrytitle><manvolnum>8</manvolnum></citerefentry>. |
| The effective maximum is 919 * (available on-disk activity-log ring-buffer area/4kB -1); |
| the default 32kB ring-buffer results in a maximum of 6433 (covers more than 25 GiB of data). |
| We recommend keeping this well within the amount your backend storage |
| and replication link are able to resync in about 5 minutes. |
| </para> |
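| |
| <para>The figures above can be checked by hand, and the same arithmetic gives a rough |
| sizing rule. The sustained resync throughput of 30 MiB/s used below is purely an |
| assumption for the sake of the example; measure your own setup before choosing a |
| value.</para> |
| |
| <programlisting format="linespecific"># effective maximum with the default 32kB ring-buffer: |
| # 919 * (32kB / 4kB - 1) = 919 * 7 = 6433 extents |
| # 6433 extents * 4 MiB = 25732 MiB, about 25.1 GiB |
| # |
| # default active set: |
| # 1237 extents * 4 MiB = 4948 MiB, about 4.8 GiB |
| # |
| # sizing for an assumed resync throughput of 30 MiB/s: |
| # 30 MiB/s * 300 s = 9000 MiB in 5 minutes |
| # 9000 MiB / 4 MiB = 2250 extents as an upper bound |
| disk { |
| # stays below the bound: 2000 * 4 MiB = 8000 MiB, about 267 s at 30 MiB/s |
| al-extents 2000; |
| }</programlisting> |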
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="al-updates"><option>al-updates <group choice="req" rep="norepeat"> |
| <arg choice="plain" rep="norepeat">yes</arg> |
| <arg choice="plain" rep="norepeat">no</arg> |
| </group></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>al-updates</secondary> |
| </indexterm> DRBD's activity log transaction writing makes it possible that, |
| after the crash of a primary node, a partial (bitmap-based) resync is |
| sufficient to bring the node back to up-to-date. |
| Setting <option>al-updates</option> to <option>no</option> might increase |
| normal operation performance but causes DRBD to do a full resync |
| when a crashed primary gets reconnected. The default value is <option>yes</option>. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="verify-alg"><option>verify-alg <replaceable>hash-alg</replaceable></option></term> |
| |
| <listitem> |
| <para>During online verification (as initiated by the <command |
| moreinfo="none">verify</command> sub-command), rather than doing a bit-wise |
| comparison, DRBD applies a hash function to the contents of every block being |
| verified, and compares that hash with the peer. This option defines the hash algorithm |
| being used for that purpose. It can be set to any of the kernel's data digest |
| algorithms. In a typical kernel configuration you should have at least one of |
| <option>md5</option>, <option>sha1</option>, and <option>crc32c</option> available. By |
| default this is not enabled; you must set this option explicitly in order to be able |
| to use online device verification.</para> |
| |
| <para>See also the notes on data integrity.</para> |
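| |
| <para>A minimal sketch, assuming the option is given in the <option>net</option> |
| section and that the kernel provides <option>sha1</option>:</para> |
| |
| <programlisting format="linespecific">net { |
| # hash used to compare blocks during online verification |
| verify-alg sha1; |
| }</programlisting> |
| |
| <para>With this in place, a verification run for the example resource |
| <option>r0</option> could be started with <command moreinfo="none">drbdadm verify |
| r0</command>.</para> |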
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="csums-alg"><option>csums-alg <replaceable>hash-alg</replaceable></option></term> |
| |
| <listitem> |
| <para>A resync process sends all marked data blocks from the source to the destination |
| node, as long as no <option>csums-alg</option> is given. When one is specified the |
| resync process exchanges hash values of all marked blocks first, and sends only those |
| data blocks that have different hash values.</para> |
| |
| <para>This setting is useful for DRBD setups with low bandwidth links. During the |
| restart of a crashed primary node, all blocks covered by the activity log are marked |
| for resync. But a large part of those will actually be still in sync, therefore using |
| <option>csums-alg</option> will lower the required bandwidth in exchange for CPU |
| cycles.</para> |
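| |
| <para>A sketch for a low-bandwidth link, again assuming the <option>net</option> |
| section and an algorithm that the running kernel actually offers:</para> |
| |
| <programlisting format="linespecific">net { |
| # exchange hashes first, then send only the blocks whose hashes differ |
| csums-alg md5; |
| }</programlisting> |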
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="c-plan-ahead"><option>c-plan-ahead <replaceable>plan_time</replaceable></option></term> |
| |
| <term xml:id="c-fill-target"><option>c-fill-target <replaceable>fill_target</replaceable></option></term> |
| |
| <term xml:id="c-delay-target"><option>c-delay-target <replaceable>delay_target</replaceable></option></term> |
| |
| <term xml:id="c-max-rate"><option>c-max-rate <replaceable>max_rate</replaceable></option></term> |
| |
| <listitem> |
| <para>The dynamic resync speed controller is enabled by setting |
| <replaceable>plan_time</replaceable> to a positive value. It aims either to keep a |
| constant amount of data, <replaceable>fill_target</replaceable>, in the buffers along |
| the data path, or to keep a constant delay time of |
| <replaceable>delay_target</replaceable> along the path. The controller has an upper |
| bound of <replaceable>max_rate</replaceable>.</para> |
| |
| <para>The <replaceable>plan_time</replaceable> configures the agility of the |
| controller. Higher values yield slower responses of the controller to |
| deviations from the target value. It should be at least 5 times RTT. For regular data |
| paths a <replaceable>fill_target</replaceable> in the range of 4k to 100k is |
| appropriate. For a setup that contains drbd-proxy it is advisable to use |
| <replaceable>delay_target</replaceable> instead. The controller uses |
| <replaceable>delay_target</replaceable> only when |
| <replaceable>fill_target</replaceable> is set to 0. For |
| <replaceable>delay_target</replaceable>, 5 times RTT is a reasonable starting value. |
| <replaceable>Max_rate</replaceable> should be set to the bandwidth available between |
| the DRBD-hosts and the machines hosting DRBD-proxy, or to the available |
| disk-bandwidth.</para> |
| |
| <para>The default value of <replaceable>plan_time</replaceable> is 0; its default unit |
| is 0.1 seconds. <replaceable>Fill_target</replaceable> defaults to 0; its default unit |
| is sectors. <replaceable>Delay_target</replaceable> defaults to 1 (100ms); its default |
| unit is 0.1 seconds. <replaceable>Max_rate</replaceable> defaults to 102400 (100MiB/s); |
| its default unit is KiB/s.</para> |
| |
| <para>The dynamic resync speed controller and its settings are available since DRBD |
| 8.3.9.</para> |
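| |
| <para>As a sketch only, the rules of thumb above could translate into the following |
| settings for a setup with DRBD-proxy. The round trip time of 200 ms is an assumption |
| made for the example, as is the placement in the <option>disk</option> section.</para> |
| |
| <programlisting format="linespecific"># assumed RTT: 200 ms, so 5 * RTT = 1 second = 10 (unit: 0.1 seconds) |
| disk { |
| # at least 5 * RTT; here 2 seconds |
| c-plan-ahead 20; |
| # 0 makes the controller use c-delay-target |
| c-fill-target 0; |
| # 5 * RTT as a starting value |
| c-delay-target 10; |
| # upper bound in KiB/s: 102400 KiB/s = 100 MiB/s |
| c-max-rate 102400; |
| }</programlisting> |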
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="c-min-rate"><option>c-min-rate <replaceable>min_rate</replaceable></option></term> |
| |
| <listitem> |
| <para>A node that is primary and sync-source has to schedule application IO requests |
| and resync IO requests. The <replaceable>min_rate</replaceable> tells DRBD to use only |
| up to min_rate for resync IO and to dedicate all other available IO bandwidth to |
| application requests.</para> |
| |
| <para>Note: The value 0 has a special meaning. It disables the limitation of resync IO |
| completely, which might slow down application IO considerably. Set it to a value of 1, |
| if you prefer that resync IO never slows down application IO.</para> |
| |
| <para>Note: Although the name might suggest that it is a lower bound for the dynamic |
| resync speed controller, it is not. If the DRBD-proxy buffer is full, the dynamic |
| resync speed controller is free to lower the resync speed down to 0, completely |
| independent of the <option>c-min-rate</option> setting.</para> |
| |
| <para><replaceable>Min_rate</replaceable> defaults to 4096 (4MiB/s); its default unit |
| is KiB/s.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="on-no-data-accessible"><option>on-no-data-accessible |
| <replaceable>ond-policy</replaceable></option></term> |
| |
| <listitem> |
| <para>This setting controls what happens to IO requests on a degraded, diskless node |
| (i.e. no data store is reachable). The available policies are |
| <option>io-error</option> and <option>suspend-io</option>.</para> |
| |
| <para>If <replaceable>ond-policy</replaceable> is set to <option>suspend-io</option>, |
| you can resume IO either by attaching/connecting the last lost data storage or with the |
| <command moreinfo="none">drbdadm resume-io <replaceable>res</replaceable></command> |
| command. The latter will, of course, result in IO errors.</para> |
| |
| <para>The default is <option>io-error</option>. This setting is available since DRBD |
| 8.3.9.</para> |
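| |
| <para>A minimal sketch, assuming the setting is given in the resource's |
| <option>options</option> section:</para> |
| |
| <programlisting format="linespecific">options { |
| # freeze IO instead of returning errors while no data is reachable |
| on-no-data-accessible suspend-io; |
| }</programlisting> |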
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="cpu-mask"><option>cpu-mask <replaceable>cpu-mask</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>cpu-mask</secondary> |
| </indexterm> Sets the cpu-affinity-mask for DRBD's kernel threads of this device. |
| The default value of <replaceable>cpu-mask</replaceable> is 0, which means that DRBD's |
| kernel threads should be spread over all CPUs of the machine. This value must be given |
| in hexadecimal notation. If it is too big it will be truncated.</para> |
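| |
| <para>Each bit of the mask stands for one CPU, starting with CPU 0 at the least |
| significant bit. The sketch below assumes the <option>options</option> section and a |
| machine with at least four CPUs:</para> |
| |
| <programlisting format="linespecific"># hexadecimal 3 = binary 0011: CPUs 0 and 1 |
| # hexadecimal c = binary 1100: CPUs 2 and 3 |
| options { |
| cpu-mask 3; |
| }</programlisting> |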
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="pri-on-incon-degr"><option>pri-on-incon-degr <replaceable>cmd</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>pri-on-incon-degr</secondary> |
| </indexterm> This handler is called if the node is primary, degraded and if the |
| local copy of the data is inconsistent.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="pri-lost-after-sb"><option>pri-lost-after-sb <replaceable>cmd</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>pri-lost-after-sb</secondary> |
| </indexterm> The node is currently primary, but lost the after-split-brain auto |
| recovery procedure. As a consequence, it should be abandoned.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="pri-lost"><option>pri-lost <replaceable>cmd</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>pri-lost</secondary> |
| </indexterm> The node is currently primary, but DRBD's algorithm thinks that it |
| should become sync target. As a consequence it should give up its primary role.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="fence-peer"><option>fence-peer <replaceable>cmd</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>fence-peer</secondary> |
| </indexterm> The handler is part of the <option>fencing</option> mechanism. This |
| handler is called in case the node needs to fence the peer's disk. It should use other |
| communication paths than DRBD's network link.</para> |
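| |
| <para>A hedged sketch for a cluster managed by Pacemaker. It assumes that the |
| <option>crm-fence-peer.sh</option> helper shipped with the DRBD utilities is installed |
| under <option>/usr/lib/drbd</option> (the path differs between distributions) and that |
| <option>fencing</option> is set in the <option>disk</option> section:</para> |
| |
| <programlisting format="linespecific">disk { |
| fencing resource-only; |
| } |
| handlers { |
| # reaches the peer through the cluster manager, not through DRBD's link |
| fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; |
| }</programlisting> |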
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="local-io-error"><option>local-io-error <replaceable>cmd</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>local-io-error</secondary> |
| </indexterm> DRBD got an IO error from the local IO subsystem.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="initial-split-brain"><option>initial-split-brain <replaceable>cmd</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>initial-split-brain</secondary> |
| </indexterm> DRBD has connected and detected a split brain situation. This handler |
| can alert someone in all cases of split brain, not just those that go |
| unresolved.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="split-brain"><option>split-brain <replaceable>cmd</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>split-brain</secondary> |
| </indexterm> DRBD detected a split brain situation that remains unresolved. Manual |
| recovery is necessary. This handler should alert someone on duty.</para> |
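| |
| <para>One possible way to alert someone is the <option>notify-split-brain.sh</option> |
| helper shipped with the DRBD utilities. The installation path and the mail recipient |
| <option>root</option> in this sketch are assumptions:</para> |
| |
| <programlisting format="linespecific">handlers { |
| # sends a mail to root when a split brain needs manual recovery |
| split-brain "/usr/lib/drbd/notify-split-brain.sh root"; |
| }</programlisting> |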
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="before-resync-target"><option>before-resync-target <replaceable>cmd</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>before-resync-target</secondary> |
| </indexterm> DRBD calls this handler just before a resync begins on the node that |
| becomes resync target. It might be used to take a snapshot of the backing block |
| device.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="after-resync-target"><option>after-resync-target <replaceable>cmd</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>after-resync-target</secondary> |
| </indexterm> DRBD calls this handler just after a resync operation finished on the |
| node whose disk just became consistent after being inconsistent for the duration of |
| the resync. It might be used to remove a snapshot of the backing device that was |
| created by the <option>before-resync-target</option> handler.</para> |
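| |
| <para>For backing devices on LVM, the two resync handlers could be paired as sketched |
| below. The sketch assumes the snapshot helper scripts shipped with the DRBD utilities |
| and their usual installation path, which may differ on your system:</para> |
| |
| <programlisting format="linespecific">handlers { |
| # keep a consistent snapshot while the disk is inconsistent |
| before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh"; |
| # drop the snapshot once the resync has finished |
| after-resync-target "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh"; |
| }</programlisting> |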
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </refsect2> |
| |
| <refsect2> |
| <title>Other Keywords</title> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="include"><option>include <replaceable>file-pattern</replaceable></option></term> |
| |
| <listitem> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>include</secondary> |
| </indexterm> Include all files matching the wildcard pattern |
| <replaceable>file-pattern</replaceable>. The <option>include</option> statement is |
| only allowed on the top level, i.e. it is not allowed inside any section.</para> |
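| |
| <para>The conventional main configuration file described in the introduction can be |
| sketched with two such statements. The relative paths assume that the files live in |
| <option>/etc/drbd.d</option> next to <option>/etc/drbd.conf</option>:</para> |
| |
| <programlisting format="linespecific">include "drbd.d/global_common.conf"; |
| include "drbd.d/*.res";</programlisting> |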
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </refsect2> |
| </refsect1> |
| |
| <refsect1 id="data-integrity"> |
| <title>Notes on data integrity</title> |
| |
| <para>There are two independent methods in DRBD to ensure the integrity of the mirrored data: |
| the online-verify mechanism and the <option>data-integrity-alg</option> of the |
| <option>net</option> section.</para> |
| |
| <para>Both mechanisms might deliver false positives if the user of DRBD modifies the data |
| which gets written to disk while the transfer goes on. This may happen for swap, or for |
| certain append while global sync, or truncate/rewrite workloads, and does not necessarily |
| pose a problem for the integrity of the data. Usually when the initiator of the data |
| transfer does this, it already knows that the data block will not be part of an on-disk |
| data structure, or will be resubmitted with correct data soon enough.</para> |
| |
| <para>The <option>data-integrity-alg</option> causes the receiving side to log an error about |
| "Digest integrity check FAILED: Ns +x\n", where N is the sector offset, and x is the size of |
| the request in bytes. It will then disconnect, and reconnect, thus causing a quick resync. If |
| the sending side at the same time detected a modification, it warns about "Digest mismatch, |
| buffer modified by upper layers during write: Ns +x\n", which shows that this was a false |
| positive. The sending side may detect these buffer modifications immediately after the |
| unmodified data has been copied to the TCP buffers, in which case the receiving side won't |
| notice it.</para> |
| |
| <para>The most recent (2007) example of systematic corruption was an issue with the TCP |
| offloading engine and the driver of a certain type of GBit NIC. The actual corruption happened |
| on the DMA transfer from core memory to the card. Since the TCP checksum gets calculated on |
| the card, this type of corruption stays undetected as long as you do not use either the online |
| <option>verify</option> or the <option>data-integrity-alg</option>.</para> |
| |
| <para>We suggest using the <option>data-integrity-alg</option> only during a pre-production |
| phase due to its CPU costs. Further, we suggest doing online <option>verify</option> runs |
| regularly, e.g. once a month during a low load period.</para> |
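| |
| <para>As a sketch of both suggestions: the first fragment enables the digest check for |
| a pre-production phase, and the second is a system crontab entry that starts a |
| verification run at 03:00 on the first day of every month. The algorithm, the schedule, |
| the crontab format with a user field, and the path to <option>drbdadm</option> are |
| assumptions. The verification run also requires <option>verify-alg</option> to be |
| set.</para> |
| |
| <programlisting format="linespecific">net { |
| # pre-production only, because of the CPU cost |
| data-integrity-alg sha1; |
| }</programlisting> |
| |
| <programlisting format="linespecific"># minute hour day-of-month month day-of-week user command |
| 0 3 1 * * root /sbin/drbdadm verify all</programlisting> |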
| </refsect1> |
| |
| <refsect1> |
| <title>Version</title> |
| |
| <simpara>This document was revised for version 8.4.0 of the DRBD distribution.</simpara> |
| </refsect1> |
| |
| <refsect1> |
| <title>Author</title> |
| |
| <simpara>Written by Philipp Reisner <email>philipp.reisner@linbit.com</email> and Lars |
| Ellenberg <email>lars.ellenberg@linbit.com</email>.</simpara> |
| </refsect1> |
| |
| <refsect1> |
| <title>Reporting Bugs</title> |
| |
| <simpara>Report bugs to <email>drbd-user@lists.linbit.com</email>.</simpara> |
| </refsect1> |
| |
| <refsect1> |
| <title>Copyright</title> |
| |
| <simpara>Copyright 2001-2008 LINBIT Information Technologies, Philipp Reisner, Lars Ellenberg. |
| This is free software; see the source for copying conditions. There is NO warranty; not even |
| for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.</simpara> |
| </refsect1> |
| |
| <refsect1> |
| <title>See Also</title> |
| |
| <para> |
| <citerefentry><refentrytitle>drbd</refentrytitle><manvolnum>8</manvolnum></citerefentry>, |
| <citerefentry><refentrytitle>drbddisk</refentrytitle><manvolnum>8</manvolnum></citerefentry>, |
| <citerefentry><refentrytitle>drbdsetup</refentrytitle><manvolnum>8</manvolnum></citerefentry>, |
| <citerefentry><refentrytitle>drbdmeta</refentrytitle><manvolnum>8</manvolnum></citerefentry>, |
| <citerefentry><refentrytitle>drbdadm</refentrytitle><manvolnum>8</manvolnum></citerefentry>, |
| <ulink url="http://www.drbd.org/users-guide/"><citetitle>DRBD User's Guide</citetitle></ulink>, |
| <ulink url="http://www.drbd.org/"><citetitle>DRBD web site</citetitle></ulink> |
| </para> |
| |
| </refsect1> |
| </refentry> |