| <drbdsetup_options> |
| <drbdsetup_option name="al-extents"> |
| <term xml:id="al-extents"><option>al-extents <replaceable>extents</replaceable></option> |
| </term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>al-extents</secondary> |
| </indexterm> DRBD automatically maintains a "hot" or "active" disk area, |
| the area likely to be written to again soon, based on recent write activity. |
| The "active" disk area can be written to immediately, while "inactive" |
| disk areas must be "activated" first, which requires a meta-data write. |
| We also refer to this active disk area as the "activity log".</para> |
| |
| <para>The activity log saves meta-data writes, but the whole log must be |
| resynced upon recovery of a failed node. The size of the activity log is |
| a major factor of how long a resync will take and how fast a replicated |
| disk will become consistent after a crash.</para> |
| |
| <para>The activity log consists of a number of 4-Megabyte segments; the |
| <replaceable>al-extents</replaceable> parameter determines how many of |
| those segments can be active at the same time. The default value for |
| <replaceable>al-extents</replaceable> is 1237, with a minimum of 7 and a |
| maximum of 65536.</para> |
| <para> |
| Note that the effective maximum may be smaller, depending on how |
| the device meta-data was created; see also |
| <citerefentry><refentrytitle>drbdmeta</refentrytitle><manvolnum>8</manvolnum></citerefentry>. |
| The effective maximum is 919 * (available on-disk activity-log ring-buffer area/4kB - 1); |
| the default 32kB ring-buffer yields an effective maximum of 6433 (which covers more than 25 GiB of data). |
| We recommend keeping this well within the amount your backend storage |
| and replication link are able to resync within about five minutes. |
| </para> |
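| <para>For example, a hypothetical <filename>drbd.conf</filename> fragment |
| (the resource name <literal>r0</literal> is illustrative) that enlarges |
| the activity log to cover roughly 13 GiB of "hot" data: |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     al-extents 3389;  # 3389 * 4 MiB, about 13 GiB of active area |
|   } |
| } |
| ]]></programlisting> |
| </para> |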
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="al-updates"> |
| <term xml:id="al-updates"><option>al-updates |
| <group choice="req" rep="norepeat"> |
| <arg choice="plain" rep="norepeat">yes</arg> |
| <arg choice="plain" rep="norepeat">no</arg> |
| </group> |
| </option> |
| </term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>al-updates</secondary> |
| </indexterm> With this parameter, the activity log can be turned off |
| entirely (see the <option>al-extents</option> parameter). This will speed |
| up writes because fewer meta-data writes will be necessary, but the |
| entire device needs to be resynchronized upon recovery of a failed |
| primary node. The default value for <option>al-updates</option> is |
| <option>yes</option>. |
| </para> |
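| <para>For example, a hypothetical fragment trading crash-recovery resync |
| time for lower write latency: |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     al-updates no;  # fewer meta-data writes, but full resync after a primary crash |
|   } |
| } |
| ]]></programlisting> |
| </para> |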
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="c-delay-target"> |
| <term xml:id="c-delay-target"><option>c-delay-target <replaceable>delay_target</replaceable></option></term> |
| |
| <term xml:id="c-fill-target"><option>c-fill-target <replaceable>fill_target</replaceable></option></term> |
| |
| <term xml:id="c-max-rate"><option>c-max-rate <replaceable>max_rate</replaceable></option></term> |
| |
| <term xml:id="c-plan-ahead"><option>c-plan-ahead <replaceable>plan_time</replaceable></option></term> |
| |
| <definition> |
| <para>Dynamically control the resync speed. This mechanism is enabled by |
| setting the <option>c-plan-ahead</option> parameter to a positive value. |
| The goal is to either fill the buffers along the data path with a defined |
| amount of data if <option>c-fill-target</option> is defined, or to have a |
| defined delay along the path if <option>c-delay-target</option> is |
| defined. The maximum bandwidth is limited by the |
| <option>c-max-rate</option> parameter.</para> |
| |
| <para>The <option>c-plan-ahead</option> parameter defines how fast drbd |
| adapts to changes in the resync speed. It should be set to five times |
| the network round-trip time or more. Common values for |
| <option>c-fill-target</option> for "normal" data paths range from 4K to |
| 100K. If drbd-proxy is used, it is advised to use |
| <option>c-delay-target</option> instead of <option>c-fill-target</option>. The |
| <option>c-delay-target</option> parameter is used if the |
| <option>c-fill-target</option> parameter is undefined or set to 0. The |
| <option>c-delay-target</option> parameter should be set to five times the |
| network round-trip time or more. The <option>c-max-rate</option> option |
| should be set to either the bandwidth available between the DRBD-hosts and the |
| machines hosting DRBD-proxy, or to the available disk bandwidth.</para> |
| |
| <para>The default values of these parameters are: |
| <option>c-plan-ahead</option> = 20 (in units of 0.1 seconds), |
| <option>c-fill-target</option> = 0 (in units of sectors), |
| <option>c-delay-target</option> = 1 (in units of 0.1 seconds), |
| and <option>c-max-rate</option> = 102400 (in units of KiB/s).</para> |
| |
| <para>Dynamic resync speed control is available since DRBD 8.3.9.</para> |
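| <para>A hypothetical tuning for a fast replication link (all values are |
| illustrative, not recommendations): |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     c-plan-ahead  20;   # 2 seconds; a positive value enables the dynamic controller |
|     c-fill-target 24M;  # keep about 24 MiB in flight along the data path |
|     c-max-rate    110M; # never resync faster than about 110 MiB/s |
|   } |
| } |
| ]]></programlisting> |
| </para> |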
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="c-min-rate"> |
| <term xml:id="c-min-rate"><option>c-min-rate <replaceable>min_rate</replaceable></option></term> |
| |
| <definition> |
| <para>A node which is primary and sync-source has to schedule application |
| I/O requests and resync I/O requests. The <option>c-min-rate</option> |
| parameter limits how much bandwidth is available for resync I/O; the |
| remaining bandwidth is used for application I/O.</para> |
| |
| <para>A <option>c-min-rate</option> value of 0 means that there is no |
| limit on the resync I/O bandwidth. This can slow down application I/O |
| significantly. Use a value of 1 (1 KiB/s) for the lowest possible resync |
| rate.</para> |
| |
| <para>The default value of <option>c-min-rate</option> is 4096, in units of |
| KiB/s.</para> |
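| <para>For example, to let application I/O throttle resync down to the |
| lowest possible rate on a hypothetical resource: |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     c-min-rate 1;  # in KiB/s; the lowest possible resync rate |
|   } |
| } |
| ]]></programlisting> |
| </para> |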
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="disk-barrier"> |
| <term xml:id="disk-barrier"><option>disk-barrier</option></term> |
| |
| <term xml:id="disk-flushes"><option>disk-flushes</option></term> |
| |
| <term xml:id="disk-drain"><option>disk-drain</option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>disk-barrier</secondary> |
| </indexterm> |
| |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>disk-flushes</secondary> |
| </indexterm> |
| |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>disk-drain</secondary> |
| </indexterm> |
| |
| <para>DRBD has three methods of handling the ordering of dependent write |
| requests: |
| <variablelist> |
| <varlistentry> |
| <term><option>disk-barrier</option></term> |
| <listitem> |
| <para>Use disk barriers to make sure that requests are written to |
| disk in the right order. Barriers ensure that all requests |
| submitted before a barrier make it to the disk before any |
| requests submitted after the barrier. This is implemented using |
| 'tagged command queuing' on SCSI devices and 'native command |
| queuing' on SATA devices. Only some devices and device stacks |
| support this method. The device mapper (LVM) only supports |
| barriers in some configurations.</para> |
| |
| <para>Note that on systems which do not support |
| disk barriers, enabling this option can lead to data loss or |
| corruption. Until DRBD 8.4.1, <option>disk-barrier</option> was |
| turned on if the I/O stack below DRBD supported barriers. |
| Kernels since linux-2.6.36 (or 2.6.32 RHEL6) no longer make it |
| possible to detect whether barriers are supported. Since drbd-8.4.2, |
| this option is off by default and needs to be enabled explicitly. |
| </para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term><option>disk-flushes</option></term> |
| <listitem> |
| <para>Use disk flushes between dependent write requests, also |
| referred to as 'force unit access' by drive vendors. This forces |
| all data to disk. This option is enabled by default. |
| </para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term><option>disk-drain</option></term> |
| <listitem> |
| <para>Wait for the request queue to "drain" (that is, wait for |
| the requests to finish) before submitting a dependent write |
| request. This method requires that requests are stable on disk |
| when they finish. Before DRBD 8.0.9, this was the only method |
| implemented. This option is enabled by default. Do not disable |
| in production environments. |
| </para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| Of these three methods, drbd will use the first that is enabled and |
| supported by the backing storage device. If all three of these options |
| are turned off, DRBD will submit write requests without bothering about |
| dependencies. Depending on the I/O stack, write requests can be |
| reordered, and they can be submitted in a different order on different |
| cluster nodes. This can result in data loss or corruption. Therefore, |
| turning off all three methods of controlling write ordering is strongly |
| discouraged. |
| </para> |
| |
| <para>A general guideline for configuring write ordering is to use disk |
| barriers or disk flushes when using ordinary disks (or an ordinary disk |
| array) with a volatile write cache. On storage without cache or with a |
| battery backed write cache, disk draining can be a reasonable |
| choice.</para> |
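| <para>Following that guideline, a hypothetical configuration for storage |
| with a battery-backed write cache might be (verify the cache really is |
| non-volatile before disabling flushes): |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     disk-barrier no; |
|     disk-flushes no;  # safe only with a non-volatile write cache |
|     disk-drain   yes; |
|   } |
| } |
| ]]></programlisting> |
| </para> |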
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="disk-timeout"> |
| <term xml:id="disk-timeout"> <option>disk-timeout</option> |
| </term> |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>disk-timeout</secondary> |
| </indexterm> |
| <para>If the lower-level device on which a DRBD device stores its data does |
| not finish an I/O request within the defined |
| <option>disk-timeout</option>, DRBD treats this as a failure. The |
| lower-level device is detached, and the device's disk state advances to |
| Diskless. If DRBD is connected to one or more peers, the failed request |
| is passed on to one of them.</para> |
| |
| <para>This option is <emphasis>dangerous and may lead to kernel panic!</emphasis></para> |
| |
| <para>"Aborting" requests, or force-detaching the disk, is intended for |
| completely blocked or hung local backing devices which no longer |
| complete requests at all, not even with error completions. In this |
| situation, usually a hard reset and failover is the only way out.</para> |
| |
| <para>By "aborting", basically faking a local error completion, |
| we allow for a more graceful switchover by cleanly migrating services. |
| Still, the affected node has to be rebooted "soon".</para> |
| <para>By completing these requests, we allow the upper layers to re-use |
| the associated data pages.</para> |
| |
| <para>If the local backing device later "recovers" and then DMAs some data |
| from disk into the original request pages, in the best case it will |
| just put random data into unused pages; but typically it will corrupt |
| meanwhile completely unrelated data, causing all sorts of damage.</para> |
| |
| <para>This means that a delayed successful completion, |
| especially of READ requests, is a reason to panic(). |
| We assume that a delayed *error* completion is OK, |
| though we will still complain noisily about it.</para> |
| <para>The default value of |
| <option>disk-timeout</option> is 0, which stands for an infinite timeout. |
| Timeouts are specified in units of 0.1 seconds. This option is available |
| since DRBD 8.3.12.</para> |
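| <para>If force-detaching is wanted despite the risks described above, a |
| hypothetical fragment could read: |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     disk-timeout 600;  # 60 seconds, in units of 0.1 seconds; 0 (the default) disables it |
|   } |
| } |
| ]]></programlisting> |
| </para> |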
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="fencing"> |
| <term xml:id="fencing"><option>fencing <replaceable>fencing_policy</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>fencing</secondary> |
| </indexterm> <option>Fencing</option> is a preventive measure to avoid |
| situations where both nodes are primary and disconnected. This is also |
| known as a split-brain situation. DRBD supports the following fencing |
| policies:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="dont-care"><option>dont-care</option></term> |
| |
| <listitem> |
| <para>No fencing actions are taken. This is the default policy.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="resource-only"><option>resource-only</option></term> |
| |
| <listitem> |
| <para>If a node becomes a disconnected primary, it tries to fence the peer. |
| This is done by calling the <option>fence-peer</option> handler. The |
| handler is supposed to reach the peer over an alternative communication path |
| and call '<option>drbdadm outdate minor</option>' there.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="resource-and-stonith"><option>resource-and-stonith</option></term> |
| |
| <listitem> |
| <para>If a node becomes a disconnected primary, it freezes all its I/O operations |
| and calls its fence-peer handler. The fence-peer handler is supposed to reach |
| the peer over an alternative communication path and call |
| '<option>drbdadm outdate minor</option>' there. In case it cannot |
| do that, it should stonith the peer. I/O is resumed as soon as |
| the situation is resolved. In case the fence-peer handler fails, |
| I/O can be resumed manually with '<option>drbdadm |
| resume-io</option>'.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
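| <para>A fencing policy is usually combined with a |
| <option>fence-peer</option> handler; a sketch (the handler paths are |
| examples from a Pacemaker setup): |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     fencing resource-and-stonith; |
|   } |
|   handlers { |
|     fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; |
|     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh"; |
|   } |
| } |
| ]]></programlisting> |
| </para> |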
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="md-flushes"> |
| <term xml:id="md-flushes"><option>md-flushes</option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>md-flushes</secondary> |
| </indexterm> |
| |
| <para>Enable disk flushes and disk barriers on the meta-data device. |
| This option is enabled by default. See the <option>disk-flushes</option> |
| parameter.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="on-io-error"> |
| <term xml:id="on-io-error"><option>on-io-error <replaceable>handler</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>on-io-error</secondary> |
| </indexterm> Configure how DRBD reacts to I/O errors on a |
| lower-level device. The following policies are defined: |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="pass_on"><option>pass_on</option></term> |
| <listitem> |
| <para>Change the disk status to Inconsistent, mark the failed |
| block as inconsistent in the bitmap, and retry the I/O operation |
| on a remote cluster node.</para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term xml:id="call-local-io-error"><option>call-local-io-error</option></term> |
| <listitem> |
| <para>Call the <option>local-io-error</option> handler (see the |
| <option>handlers</option> section).</para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term xml:id="detach"><option>detach</option></term> |
| <listitem> |
| <para>Detach the lower-level device and continue in diskless mode. |
| </para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </para> |
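| <para>For example, to detach from a failing lower-level device and |
| continue in diskless mode (a common choice): |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     on-io-error detach; |
|   } |
| } |
| ]]></programlisting> |
| </para> |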
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="read-balancing"> |
| <term xml:id="read-balancing"><option>read-balancing <replaceable>policy</replaceable></option> |
| </term> |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>read-balancing</secondary> |
| </indexterm> |
| <para> |
| Distribute read requests among cluster nodes as defined by |
| <replaceable>policy</replaceable>. The supported policies are |
| <option xml:id="prefer-local">prefer-local</option> (the default), |
| <option xml:id="prefer-remote">prefer-remote</option>, <option xml:id="round-robin">round-robin</option>, |
| <option xml:id="least-pending">least-pending</option>, <option xml:id="when-congested-remote">when-congested-remote</option>, |
| <option xml:id="_32K-striping">32K-striping</option>, <option xml:id="_64K-striping">64K-striping</option>, |
| <option xml:id="_128K-striping">128K-striping</option>, <option xml:id="_256K-striping">256K-striping</option>, |
| <option xml:id="_512K-striping">512K-striping</option> and <option xml:id="_1M-striping">1M-striping</option>.</para> |
| <para>This option is available since DRBD 8.4.1.</para> |
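| <para>For example, a hypothetical read-heavy resource could offload |
| reads to the peer whenever the local disk is congested: |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     read-balancing when-congested-remote; |
|   } |
| } |
| ]]></programlisting> |
| </para> |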
| </definition> |
| </drbdsetup_option> |
| |
| |
| <drbdsetup_option name="discard-zeroes-if-aligned"> |
| <term xml:id="discard-zeroes-if-aligned"><option>discard-zeroes-if-aligned <group choice="req" rep="norepeat"> |
| <arg choice="plain" rep="norepeat">yes</arg> |
| <arg choice="plain" rep="norepeat">no</arg> |
| </group></option></term> |
| <definition> |
| <para> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>discard-zeroes-if-aligned</secondary> |
| </indexterm> |
| There are several aspects to discard/trim/unmap support on Linux |
| block devices. Even if discard is supported in general, it may fail |
| silently, or may partially ignore discard requests. Devices also |
| announce whether reading from unmapped blocks returns defined data |
| (usually zeroes), or undefined data (possibly old data, possibly |
| garbage). |
| </para><para> |
| If on different nodes, DRBD is backed by devices with differing discard |
| characteristics, discards may lead to data divergence (old data or |
| garbage left over on one backend, zeroes due to unmapped areas on the |
| other backend). Online verify would now potentially report tons of |
| spurious differences. While probably harmless for most use cases |
| (fstrim on a file system), DRBD cannot have that. |
| </para><para> |
| To play it safe, we have to disable discard support if our local backend |
| (on a Primary) does not support "discard_zeroes_data=true". We also have to |
| translate discards into explicit zero-out on the receiving side, unless |
| the receiving side (Secondary) supports "discard_zeroes_data=true", |
| thereby allocating areas that were supposed to be unmapped. |
| </para><para> |
| There are some devices (notably the LVM/DM thin provisioning) that are |
| capable of discard, but announce discard_zeroes_data=false. In the case of |
| DM-thin, discards aligned to the chunk size will be unmapped, and |
| reading from unmapped sectors will return zeroes. However, unaligned |
| partial head or tail areas of discard requests will be silently ignored. |
| </para><para> |
| If we now add a helper to explicitly zero-out these unaligned partial |
| areas, while passing on the discard of the aligned full chunks, we |
| effectively achieve discard_zeroes_data=true on such devices. |
| </para><para> |
| Setting <option>discard-zeroes-if-aligned</option> to <option>yes</option> |
| will allow DRBD to use discards, and to announce discard_zeroes_data=true, |
| even on backends that announce discard_zeroes_data=false. |
| </para><para> |
| Setting <option>discard-zeroes-if-aligned</option> to <option>no</option> |
| will cause DRBD to always fall-back to zero-out on the receiving side, |
| and to not even announce discard capabilities on the Primary, |
| if the respective backend announces discard_zeroes_data=false. |
| </para><para> |
| DRBD used to ignore the discard_zeroes_data setting completely. To avoid |
| breaking established and expected behaviour, and to avoid suddenly causing |
| fstrim on thin-provisioned LVs to run out of space instead of freeing it, |
| the default value is <option>yes</option>. |
| </para><para> |
| This option is available since 8.4.7. |
| </para> |
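| <para>For example, to force the conservative behaviour on a hypothetical |
| resource: |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     discard-zeroes-if-aligned no;  # always zero-out on the receiving side |
|   } |
| } |
| ]]></programlisting> |
| </para> |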
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="rs-discard-granularity"> |
| <term> |
| <option>rs-discard-granularity <replaceable>byte</replaceable></option> |
| </term> |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| <secondary>rs-discard-granularity</secondary> |
| </indexterm> |
| <para> |
| When <option>rs-discard-granularity</option> is set to a non-zero, positive |
| value, DRBD tries to perform resync operations in requests of this size. |
| If such a block contains only zero bytes on the sync source node, |
| the sync target node will issue a discard/trim/unmap command for |
| the area.</para> |
| <para>The value is constrained by the discard granularity of the backing |
| block device. If <option>rs-discard-granularity</option> is not a |
| multiple of the discard granularity of the backing block device, DRBD |
| rounds it up. The feature only becomes active if the backing block device |
| reads back zeroes after a discard command.</para> |
| <para>The default value is 0. This option is available since 8.4.7. |
| </para> |
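| <para>For example, matching a hypothetical thin-provisioned backend with |
| a 64 KiB discard granularity: |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     rs-discard-granularity 65536;  # resync in 64 KiB requests; discard all-zero blocks |
|   } |
| } |
| ]]></programlisting> |
| </para> |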
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="resync-after"> |
| <term xml:id="resync-after"> |
| <only-drbdsetup> |
| <option>resync-after <replaceable>minor</replaceable></option> |
| </only-drbdsetup> |
| <only-drbd-conf> |
| <option>resync-after <replaceable>res-name</replaceable>/<replaceable>volume</replaceable></option> |
| </only-drbd-conf> |
| </term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>resync-after</secondary> |
| </indexterm> Define that a device should only resynchronize after the |
| specified other device. By default, no order between devices is |
| defined, and all devices will resynchronize in parallel. Depending on |
| the configuration of the lower-level devices, and the available |
| network and disk bandwidth, this can slow down the overall resync |
| process. This option can be used to form a chain or tree of |
| dependencies among devices.</para> |
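| <para>For example, a hypothetical chain in which <literal>r1</literal> |
| only resynchronizes after volume 0 of <literal>r0</literal> has |
| finished: |
| <programlisting><![CDATA[ |
| resource r1 { |
|   disk { |
|     resync-after r0/0;  # resource name / volume number |
|   } |
| } |
| ]]></programlisting> |
| </para> |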
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="resync-rate"> |
| <term xml:id="resync-rate"><option>resync-rate <replaceable>rate</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>resync-rate</secondary> |
| </indexterm> Define how much bandwidth DRBD may use for |
| resynchronizing. DRBD allows "normal" application I/O even during a |
| resync. If the resync takes up too much bandwidth, application I/O |
| can become very slow. This parameter allows that to be avoided. Please |
| note that this option only works when the dynamic resync controller is |
| disabled.</para> |
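| <para>For example, a hypothetical fixed-rate setup with the dynamic |
| controller disabled: |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     c-plan-ahead 0;   # disable the dynamic resync controller |
|     resync-rate  33M; # use a fixed resync rate of about 33 MiB/s |
|   } |
| } |
| ]]></programlisting> |
| </para> |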
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="size"> |
| <!-- NOTE: This description is neither used in drbd.conf.xml.in nor in |
| drbdsetup.xml.in. --> |
| <term xml:id="size"><option>size <replaceable>size</replaceable></option></term> |
| |
| <definition> |
| <para>Specify the size of the lower-level device explicitly instead of |
| determining it automatically. The device size must be determined once |
| and is remembered for the lifetime of the device. In order to |
| determine it automatically, all the lower-level devices on all nodes |
| must be attached, and all nodes must be connected. If the size is |
| specified explicitly, this is not necessary. The <option>size</option> |
| value is assumed to be in units of sectors (512 bytes) by |
| default.</para> |
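| <para>For example, explicitly fixing a hypothetical device at 100 GiB |
| (an explicit unit suffix avoids ambiguity with the default unit of |
| sectors): |
| <programlisting><![CDATA[ |
| resource r0 { |
|   disk { |
|     size 100G; |
|   } |
| } |
| ]]></programlisting> |
| </para> |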
| |
| <!-- FIXME: |
| The <option>- - size</option> option should only be used if you wish not |
| to use as much as possible from the backing block devices. If you do |
| not use <option>-d</option>, the <replaceable>device</replaceable> is |
| only ready for use as soon as it was connected to its peer once. |
| --> |
| |
| <!-- |
| <para>If you use the <replaceable>size</replaceable> parameter in |
| drbd.conf, we strongly recommend to add an explicit unit postfix. |
| drbdadm and drbdsetup used to have mismatching default units.</para> |
| --> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="dialog-refresh"> |
| <term xml:id="dialog-refresh"><option>dialog-refresh <replaceable>time</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>dialog-refresh</secondary> |
| </indexterm> The DRBD init script can be used to configure and start |
| DRBD devices, which can involve waiting for other cluster nodes. |
| While waiting, the init script shows the remaining waiting time. The |
| <option>dialog-refresh</option> parameter defines the number of seconds between |
| updates of that countdown. The default value is 1; a value of 0 turns |
| off the countdown.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="disable-ip-verification"> |
| <term xml:id="disable-ip-verification"><option>disable-ip-verification</option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>disable-ip-verification</secondary> |
| </indexterm> |
| |
| <para> |
| Normally, DRBD verifies that the IP addresses in the configuration |
| match the host names. Use the <option>disable-ip-verification</option> |
| parameter to disable these checks. |
| </para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="usage-count"> |
| <term xml:id="usage-count"><option>usage-count |
| <group choice="req" rep="norepeat"> |
| <arg choice="plain" rep="norepeat">yes</arg> |
| <arg choice="plain" rep="norepeat">no</arg> |
| <arg choice="plain" rep="norepeat">ask</arg> |
| </group> |
| </option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>usage-count</secondary> |
| </indexterm> |
| |
| <para>As explained on DRBD's <ulink url="http://usage.drbd.org"><citetitle> |
| Online Usage Counter</citetitle></ulink> web page, DRBD includes a |
| mechanism for anonymously counting how many installations are using which |
| versions of DRBD. The results are available on the web page for anyone to |
| see.</para> |
| |
| <para>This parameter defines if a cluster node participates in the usage |
| counter; the supported values are <option>yes</option>, |
| <option>no</option>, and <option>ask</option> (ask the user, the |
| default).</para> |
| |
| <para>We would like to ask users to participate in the online usage |
| counter, as this provides us with valuable feedback for steering the |
| development of DRBD.</para> |
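| <para>The parameter is set in the <literal>global</literal> section, for |
| example: |
| <programlisting><![CDATA[ |
| global { |
|   usage-count yes; |
| } |
| ]]></programlisting> |
| </para> |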
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="udev-always-use-vnr"> |
| <term xml:id="udev-always-use-vnr"><option>udev-always-use-vnr</option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>udev-always-use-vnr</secondary> |
| </indexterm> |
| |
| <para>When udev asks drbdadm for a list of device-related symlinks, |
| drbdadm suggests symlinks with differing naming conventions, |
| depending on whether the resource has explicit |
| <literal>volume VNR { }</literal> definitions, |
| or only one single volume with the implicit volume number 0: |
| <programlisting><![CDATA[ |
| # implicit single volume without "volume 0 {}" block |
| DEVICE=drbd<minor> |
| SYMLINK_BY_RES=drbd/by-res/<resource-name> |
| SYMLINK_BY_DISK=drbd/by-disk/<backing-disk-name> |
| |
| # explicit volume definition: volume VNR { } |
| DEVICE=drbd<minor> |
| SYMLINK_BY_RES=drbd/by-res/<resource-name>/VNR |
| SYMLINK_BY_DISK=drbd/by-disk/<backing-disk-name> |
| ]]></programlisting> |
| </para> |
| |
| <para>If you define this parameter in the global section, |
| drbdadm will always add the <literal>.../VNR</literal> part, |
| and will not care for whether the volume definition was implicit or explicit. |
| </para> |
| |
| <para>For backward compatibility, this is off by default, |
| but we recommend enabling it.</para> |
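| <para>For example, in the <literal>global</literal> section: |
| <programlisting><![CDATA[ |
| global { |
|   udev-always-use-vnr;  # always generate .../by-res/<resource-name>/VNR symlinks |
| } |
| ]]></programlisting> |
| </para> |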
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="after-sb-0pri"> |
| <term xml:id="after-sb-0pri"><option>after-sb-0pri <replaceable>policy</replaceable></option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>after-sb-0pri</secondary> |
| </indexterm> |
| |
| <para>Define how to react if a split-brain scenario is detected and none |
| of the two nodes is in primary role. (We detect split-brain scenarios |
| when two nodes connect; split-brain decisions are always between two |
| nodes.) The defined policies are:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term><option>disconnect</option></term> |
| |
| <listitem> |
| <para>No automatic resynchronization; simply disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><option>discard-younger-primary</option></term> |
| <term><option>discard-older-primary</option></term> |
| |
| <listitem> |
| <para>Resynchronize from the node which became primary first |
| (<option>discard-younger-primary</option>) or last |
| (<option>discard-older-primary</option>). If both nodes became |
| primary independently, the <option>discard-least-changes</option> |
| policy is used.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><option>discard-zero-changes</option></term> |
| |
| <listitem> |
| <para>If only one of the nodes wrote data since the split brain |
| situation was detected, resynchronize from this node to the other. |
| If both nodes wrote data, disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><option>discard-least-changes</option></term> |
| |
| <listitem> |
| <para>Resynchronize from the node with more modified blocks.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><option>discard-node-<replaceable>nodename</replaceable></option></term> |
| |
| <listitem> |
| <para>Always resynchronize to the named node.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| <!-- FIXME: Refer to rr-conflict. --> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="after-sb-1pri"> |
| <term xml:id="after-sb-1pri"><option>after-sb-1pri <replaceable>policy</replaceable></option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>after-sb-1pri</secondary> |
| </indexterm> |
| |
| <para>Define how to react if a split-brain scenario is detected, with one |
| node in primary role and one node in secondary role. (We detect |
| split-brain scenarios when two nodes connect, so split-brain decisions |
| are always between two nodes.) The defined policies are:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term><option>disconnect</option></term> |
| |
| <listitem> |
| <para>No automatic resynchronization, simply disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><option>consensus</option></term> |
| |
| <listitem> |
| <para>Discard the data on the secondary node if the |
| <option>after-sb-0pri</option> algorithm would also discard the |
| data on the secondary node. Otherwise, disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><option>violently-as0p</option></term> |
| |
| <listitem> |
| <para>Always take the decision of the <option>after-sb-0pri</option> algorithm, |
| even if it causes an erratic change of the primary's view of the |
| data. This is only useful if a single-node file system (i.e., not |
| OCFS2 or GFS) with the <option>allow-two-primaries</option> flag |
| is used. This option can cause the primary node to crash, and |
| should not be used.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="discard-secondary"><option>discard-secondary</option></term> |
| |
| <listitem> |
| <para>Discard the data on the secondary node.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="call-pri-lost-after-sb"><option>call-pri-lost-after-sb</option></term> |
| |
| <listitem> |
| <para>Always take the decision of the |
| <option>after-sb-0pri</option> algorithm. If the decision is to |
| discard the data on the primary node, call the |
| <option xml:id="pri-lost-after-sb">pri-lost-after-sb</option> handler on the primary |
| node.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| <!-- FIXME: Refer to rr-conflict. --> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="after-sb-2pri"> |
| <term xml:id="after-sb-2pri"><option>after-sb-2pri <replaceable>policy</replaceable></option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>after-sb-2pri</secondary> |
| </indexterm> |
| |
| <para>Define how to react if a split-brain scenario is detected and both |
| nodes are in primary role. (We detect split-brain scenarios when two |
| nodes connect, so split-brain decisions are always among two nodes.) The |
| defined policies are:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term><option>disconnect</option></term> |
| |
| <listitem> |
| <para>No automatic resynchronization, simply disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="violently-as0p"><option>violently-as0p</option></term> |
| |
| <listitem> |
| <para>See the <option>violently-as0p</option> policy for |
| <option>after-sb-1pri</option>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><option>call-pri-lost-after-sb</option></term> |
| |
| <listitem> |
| <para>Call the <option>pri-lost-after-sb</option> helper program on one |
| of the machines unless that machine can demote to secondary. The helper |
| program is expected to reboot the machine, which brings the node into |
| a secondary role. Which machine runs the helper program is determined |
| by the <option>after-sb-0pri</option> strategy.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| <!-- FIXME: Refer to rr-conflict. --> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="allow-two-primaries"> |
| <term xml:id="allow-two-primaries"><option>allow-two-primaries</option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>allow-two-primaries</secondary> |
| </indexterm> The most common way to configure DRBD devices is to allow |
| only one node to be primary (and thus writable) at a time.</para> |
| |
| <para>In some scenarios it is preferable to allow two nodes to be |
| primary at once; a mechanism outside of DRBD then must make sure that |
| writes to the shared, replicated device happen in a coordinated way. |
| This can be done with a shared-storage cluster file system like OCFS2 |
| and GFS, or with virtual machine images and a virtual machine manager |
| that can migrate virtual machines between physical machines.</para> |
| |
| <para>The <option>allow-two-primaries</option> parameter tells DRBD to |
| allow two nodes to be primary at the same time. Never enable this |
| option when using a non-distributed file system; otherwise, data |
| corruption and node crashes will result!</para> |
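| <para>As an illustrative sketch (the resource name <replaceable>r0</replaceable> |
| is hypothetical), the parameter is set in the <option>net</option> section of |
| a resource:</para> |
| <screen format="linespecific"> |
| resource r0 { |
|     net { |
|         allow-two-primaries yes; |
|     } |
| } |
| </screen> |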
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="always-asbp"> |
| <term xml:id="always-asbp"><option>always-asbp</option></term> |
| <!-- FIXME: this option does not mke any sense anymore. How can we fix this? --> |
| <definition> |
| <para>Normally, the automatic after-split-brain policies are only used if the |
| current states of the UUIDs do not indicate the presence of a third node.</para> |
| |
| <para>With this option, you request that the automatic after-split-brain |
| policies are used as long as the data sets of the nodes are somehow related. |
| This can cause a full sync if the UUIDs indicate the presence of a third |
| node, or if double faults have led to strange UUID sets.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="connect-int"> |
| <term xml:id="connect-int"><option>connect-int <replaceable>time</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>connect-int</secondary> |
| </indexterm> As soon as a connection between two nodes is configured |
| with <command moreinfo="none">drbdsetup connect</command>, DRBD |
| immediately tries to establish the connection. If this fails, DRBD |
| waits for <option>connect-int</option> seconds and then repeats. The |
| default value of <option>connect-int</option> is 10 seconds.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="cram-hmac-alg"> |
| <term xml:id="cram-hmac-alg"><option>cram-hmac-alg <replaceable>hash-algorithm</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>cram-hmac-alg</secondary> |
| </indexterm> Configure the hash-based message authentication code |
| (HMAC) or secure hash algorithm to use for peer authentication. The |
| kernel supports a number of different algorithms, some of which may be |
| loadable as kernel modules. See the shash algorithms listed in |
| /proc/crypto. By default, <option>cram-hmac-alg</option> is unset. |
| Peer authentication also requires a <option>shared-secret</option> to |
| be configured.</para> |
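| <para>For illustration, peer authentication could be configured in the |
| <option>net</option> section like this (the algorithm and secret shown are |
| examples only; the algorithm must be available as a shash algorithm in |
| /proc/crypto):</para> |
| <screen format="linespecific"> |
| net { |
|     cram-hmac-alg sha1; |
|     shared-secret "FooFunFactory"; |
| } |
| </screen> |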
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="csums-alg"> |
| <term xml:id="csum-alg"><option>csums-alg <replaceable>hash-algorithm</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>csums-alg</secondary> |
| </indexterm> Normally, when two nodes resynchronize, the sync target |
| requests a piece of out-of-sync data from the sync source, and the sync |
| source sends the data. With many usage patterns, a significant number of those blocks |
| will actually be identical.</para> |
| |
| <para>When a <option>csums-alg</option> algorithm is specified, when |
| requesting a piece of out-of-sync data, the sync target also sends |
| along a hash of the data it currently has. The sync source compares |
| this hash with its own version of the data. It sends the sync target |
| the new data if the hashes differ, and tells it that the data are the |
| same otherwise. This reduces the network bandwidth required, at the |
| cost of higher CPU utilization and possibly increased I/O on the sync |
| target.</para> |
| |
| <para>The <option>csums-alg</option> can be set to one of the secure |
| hash algorithms supported by the kernel; see the shash algorithms |
| listed in /proc/crypto. By default, <option>csums-alg</option> is |
| unset.</para> |
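| <para>A sketch of enabling checksum-based resync (sha1 is an example choice; |
| any shash algorithm listed in /proc/crypto can be used):</para> |
| <screen format="linespecific"> |
| net { |
|     csums-alg sha1; |
| } |
| </screen> |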
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="csums-after-crash-only"> |
| <term xml:id="csums-after-crash-only"><option>csums-after-crash-only</option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>csums-after-crash-only</secondary> |
| </indexterm> Enabling this option (together with <option>csums-alg</option>, |
| above) restricts checksum-based resync to the first resync after a primary |
| crash; it is not used for later "network hiccups".</para> |
| <para>In most cases, blocks that are marked as needing resync have in fact |
| changed, so calculating checksums, and both reading and writing the blocks |
| on the resync target, is pure overhead.</para> |
| <para>The advantage of checksum-based resync shows mostly after primary crash |
| recovery, where recovery has marked larger areas (those covered by the |
| activity log) as needing resync just in case. Introduced in 8.4.5.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="data-integrity-alg"> |
| <term xml:id="data-integrity-alg"><option>data-integrity-alg </option> <replaceable>alg</replaceable></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>data-integrity-alg</secondary> |
| </indexterm> |
| |
| <para>DRBD normally relies on the data integrity checks built into the |
| TCP/IP protocol, but if a data integrity algorithm is configured, it will |
| additionally use this algorithm to make sure that the data received over |
| the network match what the sender has sent. If a data integrity error is |
| detected, DRBD will close the network connection and reconnect, which |
| will trigger a resync.</para> |
| |
| <para>The <option>data-integrity-alg</option> can be set to one of the |
| secure hash algorithms supported by the kernel; see the shash algorithms |
| listed in /proc/crypto. By default, this mechanism is turned off.</para> |
| |
| <para>Because of the CPU overhead involved, we recommend not using this |
| option in production environments. Also see the notes on data |
| integrity below.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="ko-count"> |
| <term xml:id="ko-count"><option>ko-count <replaceable>number</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>ko-count</secondary> |
| </indexterm> If a secondary node fails to complete a write request in |
| <option>ko-count</option> times the <option>timeout</option> parameter, |
| it is excluded from the cluster. The primary node then sets the |
| connection to this secondary node to Standalone. |
| To disable this feature, you should explicitly set it to 0; defaults may change between versions. |
| </para> |
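| <para>As a worked example with hypothetical values: with the settings below, |
| a secondary node that fails to complete a write request within |
| 7 * 6 seconds = 42 seconds would be excluded from the cluster (the |
| <option>timeout</option> unit is tenths of a second):</para> |
| <screen format="linespecific"> |
| net { |
|     timeout 60;    # 6 seconds |
|     ko-count 7;    # exclude after 7 * timeout without progress |
| } |
| </screen> |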
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="max-buffers"> |
| <term xml:id="max-buffers"><option>max-buffers <replaceable>number</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>max-buffers</secondary> |
| </indexterm> Limits the memory usage per DRBD minor device on the receiving side, |
| or for internal buffers during resync or online-verify. |
| Unit is PAGE_SIZE, which is 4 KiB on most systems. |
| The minimum possible setting is hard coded to 32 (=128 KiB). |
| These buffers are used to hold data blocks while they are written to/read from disk. |
| To avoid possible distributed deadlocks on congestion, this setting is used |
| as a throttle threshold rather than a hard limit. Once more than max-buffers |
| pages are in use, further allocation from this pool is throttled. |
| You want to increase max-buffers if you cannot saturate the IO backend on the |
| receiving side.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="max-epoch-size"> |
| <term xml:id="max-epoch-size"><option>max-epoch-size <replaceable>number</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>max-epoch-size</secondary> |
| </indexterm> Define the maximum number of write requests DRBD may issue |
| before issuing a write barrier. The default value is 2048, with a |
| minimum of 1 and a maximum of 20000. Setting this parameter to a value |
| below 10 is likely to decrease performance.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="on-congestion"> |
| <term xml:id="on-congestion"><option>on-congestion <replaceable>policy</replaceable></option></term> |
| |
| <term xml:id="congestion-fill"><option>congestion-fill <replaceable>threshold</replaceable></option></term> |
| |
| <term xml:id="congestion-extents"><option>congestion-extents |
| <replaceable>threshold</replaceable></option></term> |
| |
| <definition> |
| <para>By default, DRBD blocks when the TCP send queue is full. This prevents |
| applications from generating further write requests until more buffer |
| space becomes available again.</para> |
| |
| <para>When DRBD is used together with DRBD-proxy, it can be better to use |
| the <option>pull-ahead</option> <option>on-congestion</option> policy, |
| which can switch DRBD into ahead/behind mode before the send queue is full. |
| DRBD then records the differences between itself and the peer in its |
| bitmap, but it no longer replicates them to the peer. When enough buffer |
| space becomes available again, the node resynchronizes with the peer and |
| switches back to normal replication.</para> |
| |
| <para>This has the advantage of not blocking application I/O even when the |
| queues fill up, and the disadvantage that peer nodes can fall behind much |
| further. Also, while resynchronizing, peer nodes will become |
| inconsistent.</para> |
| |
| <para>The available congestion policies are <option>block</option> (the |
| default) and <option>pull-ahead</option>. The |
| <option>congestion-fill</option> parameter defines how much data is |
| allowed to be "in flight" in this connection. The default value is 0, |
| which disables this mechanism of congestion control, with a maximum of |
| 10 GiBytes. The <option>congestion-extents</option> parameter defines |
| how many bitmap extents may be active before switching into ahead/behind |
| mode, with the same default and limits as the <option>al-extents</option> |
| parameter. The <option>congestion-extents</option> parameter is |
| effective only when set to a value smaller than |
| <option>al-extents</option>.</para> |
| |
| <para>Ahead/behind mode is available since DRBD 8.3.10.</para> |
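| <para>A configuration sketch for such a setup (the threshold values are |
| examples only):</para> |
| <screen format="linespecific"> |
| net { |
|     on-congestion pull-ahead; |
|     congestion-fill 2G;       # switch when about 2 GiB are in flight |
|     congestion-extents 500;   # must be smaller than al-extents to take effect |
| } |
| </screen> |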
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="ping-int"> |
| <term xml:id="ping-int"><option>ping-int <replaceable>interval</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>ping-int</secondary> |
| </indexterm> When the TCP/IP connection to a peer is idle for more than |
| <option>ping-int</option> seconds, DRBD will send a keep-alive packet |
| to make sure that a failed peer or network connection is detected |
| reasonably soon. The default value is 10 seconds, with a minimum of 1 |
| and a maximum of 120 seconds. The unit is seconds.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="ping-timeout"> |
| <term xml:id="ping-timeout"><option>ping-timeout <replaceable>timeout</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>ping-timeout</secondary> |
| </indexterm> Define the timeout for replies to keep-alive packets. If |
| the peer does not reply within <option>ping-timeout</option>, DRBD will |
| close and try to reestablish the connection. The default value is 0.5 |
| seconds, with a minimum of 0.1 seconds and a maximum of 3 seconds. The |
| unit is tenths of a second.</para> |
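| <para>To make the interaction of the two keep-alive parameters concrete, here |
| are their defaults written out explicitly (note the differing units):</para> |
| <screen format="linespecific"> |
| net { |
|     ping-int 10;      # seconds: send a keep-alive after 10 s of idle time |
|     ping-timeout 5;   # tenths of a second: expect the reply within 0.5 s |
| } |
| </screen> |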
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="socket-check-timeout"> |
| <term xml:id="socket-check-timeout"><option>socket-check-timeout <replaceable>timeout</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>socket-check-timeout</secondary> |
| </indexterm>In setups involving a DRBD-proxy and connections that experience |
| a lot of buffer bloat, it may be necessary to set |
| <option>ping-timeout</option> to an unusually high value. By default, DRBD |
| uses the same value to wait until a newly established TCP connection proves |
| stable. Since the DRBD-proxy is usually located in the same data center, |
| such a long wait time can hinder DRBD's connect process.</para> |
| <para>In such setups, <option>socket-check-timeout</option> should be set to |
| at least the round-trip time between DRBD and DRBD-proxy, i.e. in most |
| cases to 1.</para> |
| <para>The default unit is tenths of a second, the default value is 0 (which causes |
| DRBD to use the value of <option>ping-timeout</option> instead). |
| Introduced in 8.4.5.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="protocol"> |
| <term xml:id="protocol"><option>protocol <replaceable>name</replaceable></option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>protocol</secondary> |
| </indexterm> |
| |
| <para>Use the specified protocol on this connection. The supported |
| protocols are: |
| <variablelist> |
| <varlistentry> |
| <term xml:id="A"><option>A</option></term> |
| |
| <listitem> |
| <para>Writes to the DRBD device complete as soon as they have |
| reached the local disk and the TCP/IP send buffer.</para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term xml:id="B"><option>B</option></term> |
| |
| <listitem> |
| <para>Writes to the DRBD device complete as soon as they have |
| reached the local disk, and all peers have acknowledged the |
| receipt of the write requests.</para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term xml:id="C"><option>C</option></term> |
| |
| <listitem> |
| <para>Writes to the DRBD device complete as soon as they have |
| reached the local and all remote disks.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </para> |
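| <para>For example, fully synchronous replication would be configured |
| as:</para> |
| <screen format="linespecific"> |
| net { |
|     protocol C; |
| } |
| </screen> |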
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="rcvbuf-size"> |
| <term xml:id="rcvbuf-size"><option>rcvbuf-size <replaceable>size</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>rcvbuf-size</secondary> |
| </indexterm> Configure the size of the TCP/IP receive buffer. A value |
| of 0 (the default) causes the buffer size to adjust dynamically. |
| This parameter usually does not need to be set, but it can be set |
| to a value up to 10 MiB. The default unit is bytes.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="rr-conflict"> |
| <term xml:id="rr-conflict"><option>rr-conflict</option> <replaceable>policy</replaceable></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>rr-conflict</secondary> |
| </indexterm> |
| |
| <para>This option helps resolve situations in which the outcome of the |
| resync decision is incompatible with the current role assignment in the |
| cluster. The defined policies are:</para> |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="disconnect"><option>disconnect</option></term> |
| |
| <listitem> |
| <para>No automatic resynchronization, simply disconnect.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="violently"><option>violently</option></term> |
| |
| <listitem> |
| <para>Resync to the primary node is allowed, violating the assumption that data on |
| a block device are stable for one of the nodes. <emphasis>Do not |
| use this option, it is dangerous.</emphasis></para> <!-- What would happen? --> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term xml:id="call-pri-lost"><option>call-pri-lost</option></term> |
| |
| <listitem> |
| <para>Call the <option>pri-lost</option> handler on one of the machines. The handler is |
| expected to reboot the machine, which puts it into secondary role.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| <!-- FIXME: It is completely unclear how this option interacts with |
| after-sb-0pri, after-sb-1pri, and after-sb-2pri. --> |
| <!-- FIXME: Refer to after-sb-0pri, after-sb-1pri, and after-sb-2pri. --> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="shared-secret"> |
| <term xml:id="shared-secret"><option>shared-secret <replaceable>secret</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>shared-secret</secondary> |
| </indexterm> Configure the shared secret used for peer authentication. |
| The secret is a string of up to 64 characters. Peer authentication also |
| requires the <option>cram-hmac-alg</option> parameter to be set.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="sndbuf-size"> |
| <term xml:id="sndbuf-size"><option>sndbuf-size <replaceable>size</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>sndbuf-size</secondary> |
| </indexterm> Configure the size of the TCP/IP send buffer. Since DRBD |
| 8.0.13 / 8.2.7, a value of 0 (the default) causes the buffer size to |
| adjust dynamically. Values below 32 KiB are harmful to the throughput |
| on this connection. Large buffer sizes can be useful especially when |
| protocol A is used over high-latency networks; the maximum value |
| supported is 10 MiB.</para> |
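| <para>A sketch for a protocol A link over a high-latency network (the buffer |
| size shown is an example only):</para> |
| <screen format="linespecific"> |
| net { |
|     protocol A; |
|     sndbuf-size 4M;   # fixed 4 MiB send buffer; 0 would auto-tune |
| } |
| </screen> |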
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="tcp-cork"> |
| <term xml:id="tcp-cork"><option>tcp-cork</option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>tcp-cork</secondary> |
| </indexterm> |
| |
| <para>By default, DRBD uses the TCP_CORK socket option to prevent the |
| kernel from sending partial messages; this results in fewer and bigger |
| packets on the network. Some network stacks can perform worse with this |
| optimization. On these, the <option>tcp-cork</option> parameter can be |
| used to turn this optimization off.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="timeout"> |
| <term xml:id="timeout"><option>timeout <replaceable>time</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>timeout</secondary> |
| </indexterm> Define the timeout for replies over the network: if a peer |
| node does not send an expected reply within the specified <option>timeout</option>, |
| it is considered dead and the TCP/IP connection is closed. The timeout |
| value must be lower than <option>connect-int</option> and lower than |
| <option>ping-int</option>. The default is 6 seconds; the value is |
| specified in tenths of a second.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="use-rle"> |
| <term xml:id="use-rle"><option>use-rle</option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>use-rle</secondary> |
| </indexterm> Each replicated device on a cluster node has a separate |
| bitmap for each of its peer devices. The bitmaps are used for tracking |
| the differences between the local and peer device: depending on the |
| cluster state, a disk range can be marked as different from the peer in |
| the device's bitmap, in the peer device's bitmap, or in both bitmaps. |
| When two cluster nodes connect, they exchange each other's bitmaps, and |
| they each compute the union of the local and peer bitmap to determine |
| the overall differences.</para> |
| |
| <para>Bitmaps of very large devices are also relatively large, but they |
| usually compress very well using run-length encoding. This can save |
| time and bandwidth for the bitmap transfers.</para> |
| |
| <para>The <option>use-rle</option> parameter determines if run-length |
| encoding should be used. It is on by default since DRBD 8.4.0.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="verify-alg"> |
| <term xml:id="verify-alg"><option>verify-alg <replaceable>hash-algorithm</replaceable></option></term> |
| |
| <definition> |
| <para>Online verification (<command moreinfo="none">drbdadm |
| verify</command>) computes and compares checksums of disk blocks |
| (i.e., hash values) in order to detect if they differ. The |
| <option>verify-alg</option> parameter determines which algorithm to use |
| for these checksums. It must be set to one of the secure hash algorithms |
| supported by the kernel before online verify can be used; see the shash |
| algorithms listed in /proc/crypto.</para> |
| |
| <para>We recommend scheduling online verifications regularly during |
| low-load periods, for example once a month. Also see the notes on data |
| integrity below.</para> |
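| <para>For illustration (the resource name <replaceable>r0</replaceable> is |
| hypothetical), online verification could be enabled and started like |
| this:</para> |
| <screen format="linespecific"> |
| net { |
|     verify-alg sha1; |
| } |
| </screen> |
| <para>With the algorithm configured, <command moreinfo="none">drbdadm verify |
| r0</command> starts an online verify run on the resource.</para> |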
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="discard-my-data"> |
| <term xml:id="discard-my-data"><option>discard-my-data</option></term> |
| |
| <definition> |
| <para>Discard the local data and resynchronize with the peer that has the |
| most up-to-date data. Use this option to manually recover from a |
| split-brain situation.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="tentative"> |
| <term xml:id="tentative"><option>tentative</option></term> |
| |
| <definition> |
| <para>Only determine if a connection to the peer can be established and |
| if a resync is necessary (and in which direction) without actually |
| establishing the connection or starting the resync. Check the system |
| log to see what DRBD would do without the <option>--tentative</option> |
| option.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="auto-promote"> |
| <term xml:id="auto-promote"><option>auto-promote <replaceable>bool-value</replaceable></option></term> |
| |
| <definition> |
| <indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>auto-promote</secondary> |
| </indexterm> |
| |
| <para>A resource must be promoted to primary role before any of its devices |
| can be mounted or opened for writing.</para> |
| |
| <para>Before DRBD 9, this could only be done explicitly ("drbdadm |
| primary"). Since DRBD 9, the <option>auto-promote</option> parameter |
| allows a resource to be automatically promoted to primary role when one of |
| its devices is mounted or opened for writing. As soon as all devices are |
| unmounted or closed with no more remaining users, the role of the |
| resource changes back to secondary.</para> |
| |
| <para>Automatic promotion only succeeds if the cluster state allows it |
| (that is, if an explicit <command moreinfo="none">drbdadm |
| primary</command> command would succeed). Otherwise, mounting or |
| opening the device fails as it already did before DRBD 9: the |
| <citerefentry><refentrytitle>mount</refentrytitle><manvolnum>2</manvolnum></citerefentry> |
| system call fails with errno set to EROFS (Read-only file system); the |
| <citerefentry><refentrytitle>open</refentrytitle><manvolnum>2</manvolnum></citerefentry> |
| system call fails with errno set to EMEDIUMTYPE (wrong medium |
| type).</para> |
| |
| <para>Irrespective of the <option>auto-promote</option> parameter, if a |
| device is promoted explicitly (<command moreinfo="none">drbdadm |
| primary</command>), it also needs to be demoted explicitly (<command |
| moreinfo="none">drbdadm secondary</command>).</para> |
| |
| <para>The <option>auto-promote</option> parameter is available since DRBD |
| 9.0.0, and defaults to <constant>yes</constant>.</para> |
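| <para>For example, the pre-DRBD-9 behavior of requiring explicit promotion |
| can be restored in a resource's <option>options</option> section:</para> |
| <screen format="linespecific"> |
| options { |
|     auto-promote no; |
| } |
| </screen> |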
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="cpu-mask"> |
| <term xml:id="cpu-mask"><option>cpu-mask <replaceable>cpu-mask</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>cpu-mask</secondary> |
| </indexterm> Set the cpu affinity mask for DRBD kernel threads. The |
| cpu mask is specified as a hexadecimal number. The default value is 0, |
| which lets the scheduler decide which kernel threads run on which CPUs. |
| CPU numbers in <option>cpu-mask</option> which do not exist in the |
| system are ignored.</para> |
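| <para>As a worked example of the hexadecimal mask: the value 3 (binary 11) |
| restricts DRBD's kernel threads to CPUs 0 and 1 (shown here in a resource's |
| <option>options</option> section):</para> |
| <screen format="linespecific"> |
| options { |
|     cpu-mask 3;   # 0x3 = CPUs 0 and 1 |
| } |
| </screen> |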
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="on-no-data-accessible"> |
| <term xml:id="on-no-data-accessible"><option>on-no-data-accessible |
| <replaceable>policy</replaceable></option></term> |
| |
| <definition> |
| <para>Determine how to deal with I/O requests when the requested data is |
| not available locally or remotely (for example, when all disks have |
| failed). The defined policies are: |
| |
| <variablelist> |
| <varlistentry> |
| <term xml:id="io-error"><option>io-error</option></term> |
| <listitem><para> |
| System calls fail with errno set to EIO. |
| </para></listitem> |
| </varlistentry> |
| <varlistentry> |
| <term xml:id="suspend-io"><option>suspend-io</option></term> |
| <listitem><para> |
| The resource suspends I/O. I/O can be resumed by (re)attaching |
| the lower-level device, by connecting to a peer which has |
| access to the data, or by forcing DRBD to resume I/O with |
| <command moreinfo="none">drbdadm resume-io |
| <replaceable>res</replaceable></command>. When no data is |
| available, forcing I/O to resume will result in the same |
| behavior as the <option>io-error</option> policy. |
| </para></listitem> |
| </varlistentry> |
| </variablelist> |
| |
| This setting is available since DRBD 8.3.9; the default policy is |
| <option>io-error</option>. </para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="peer-ack-window"> |
| <term xml:id="peer-ack-window"><option>peer-ack-window <replaceable>value</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>peer-ack-window</secondary> |
| </indexterm> |
| On each node and for each device, DRBD maintains a bitmap of the |
| differences between the local and remote data for each peer device. |
| For example, in a three-node setup (nodes A, B, C) each with a single |
| device, every node maintains one bitmap for each of its peers.</para> |
| |
| <para>When nodes receive write requests, they know how to update the |
| bitmaps for the writing node, but not how to update the bitmaps between |
| themselves. In this example, when a write request propagates from node |
| A to B and C, nodes B and C know that they have the same data as node |
| A, but not whether or not they both have the same data.</para> |
| |
| <para>As a remedy, the writing node occasionally sends peer-ack packets |
| to its peers which tell them which state they are in relative to each |
| other.</para> |
| |
| <para>The <option>peer-ack-window</option> parameter specifies how much |
| data a primary node may send before sending a peer-ack packet. A low |
| value causes increased network traffic; a high value causes less |
| network traffic but higher memory consumption on secondary nodes and |
| higher resync times between the secondary nodes after primary node |
| failures. (Note: peer-ack packets may be sent for other reasons as |
| well, e.g. membership changes or expiry of the |
| <option>peer-ack-delay</option> timer.)</para> |
| |
| <para>The default value for <option>peer-ack-window</option> is 2 MiB, |
| the default unit is sectors. This option is available since |
| 9.0.0.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="peer-ack-delay"> |
| <term xml:id="peer-ack-delay"><option>peer-ack-delay <replaceable>expiry-time</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>peer-ack-delay</secondary> |
| </indexterm> |
| If no new write request is issued for <replaceable>expiry-time</replaceable> |
| after the last write request has finished, a peer-ack packet is sent. |
| If a new write request is issued before the timer expires, the timer is reset |
| to <replaceable>expiry-time</replaceable>. (Note: peer-ack packets may be sent |
| for other reasons as well, e.g. membership changes or the |
| <option>peer-ack-window</option> option.)</para> |
| <para>This parameter may influence resync behavior on remote nodes. Peer |
| nodes need to wait until they receive a peer-ack before they can release a |
| lock on an AL-extent. Resync operations between peers may need to wait for |
| these locks.</para> |
| <para>The default value for <option>peer-ack-delay</option> is 100 |
| milliseconds; the default unit is milliseconds. This option is available |
| since DRBD 9.0.0.</para> |
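| <para>As a sketch (the value is illustrative, not a recommendation), |
| <option>peer-ack-delay</option> is likewise set in the |
| <option>net</option> section of a resource in |
| <filename>drbd.conf</filename>:</para> |
| <programlisting> |
| resource r0 { |
|   net { |
|     # send a peer-ack after 50 ms without new write requests |
|     peer-ack-delay 50; |
|   } |
| } |
| </programlisting> |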
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="degr-wfc-timeout"> |
| <term xml:id="degr-wfc-timeout"><option>degr-wfc-timeout <replaceable>timeout</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>degr-wfc-timeout</secondary> |
| </indexterm> Define how long to wait until all peers are |
| connected in case the cluster consisted of a single node only |
| when the system went down. This parameter is usually set to a |
| value smaller than <option>wfc-timeout</option>. The |
| assumption here is that peers which were unreachable before a |
| reboot are less likely to be reachable after the reboot, so |
| waiting is less likely to help.</para> |
| |
| <para>The timeout is specified in seconds. The default value is 0, |
| which stands for an infinite timeout. Also see the |
| <option>wfc-timeout</option> parameter.</para> |
| <!-- FIXME: How does wfc-timeout vs. degr-wfc-timeout work with |
| more than two nodes in the cluster? If a cluster is only |
| "degraded" when only one node remains and only one out of |
| three nodes fails, we will still wait for that one node for |
| wfc-timeout, which might be forever. --> |
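| <para>A sketch of the startup timeouts discussed here, placed in the |
| <option>startup</option> section of a resource in |
| <filename>drbd.conf</filename> (the values are illustrative |
| only):</para> |
| <programlisting> |
| resource r0 { |
|   startup { |
|     wfc-timeout      120; # wait up to 2 minutes for all peers |
|     degr-wfc-timeout  60; # shorter wait if the cluster was degraded |
|   } |
| } |
| </programlisting> |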
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="outdated-wfc-timeout"> |
| <term xml:id="outdated-wfc-timeout"><option>outdated-wfc-timeout <replaceable>timeout</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>outdated-wfc-timeout</secondary> |
| </indexterm> Define how long to wait until all peers are |
| connected if all peers were outdated when the system went down. |
| This parameter is usually set to a value smaller than |
| <option>wfc-timeout</option>. The assumption here is that an |
| outdated peer cannot have become primary in the meantime, so we |
| don't need to wait for it as long as for a node which was alive |
| before.</para> |
| |
| <para>The timeout is specified in seconds. The default value is 0, |
| which stands for an infinite timeout. Also see the |
| <option>wfc-timeout</option> parameter.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="wait-after-sb"> |
| <term xml:id="wait-after-sb"><option>wait-after-sb</option></term> |
| |
| <definition> |
| <para>This parameter causes DRBD to continue waiting in the init |
| script even when a split-brain situation has been detected, and |
| the nodes therefore refuse to connect to each other.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="wfc-timeout"> |
| <term xml:id="wfc-timeout"><option>wfc-timeout <replaceable>timeout</replaceable></option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>wfc-timeout</secondary> |
| </indexterm> Define how long the init script waits until all peers are |
| connected. This can be useful in combination with a cluster manager |
| which cannot manage DRBD resources: when the cluster manager starts, |
| the DRBD resources will already be up and running. With a more capable |
| cluster manager such as Pacemaker, it makes more sense to let the |
| cluster manager control DRBD resources. The timeout is specified in |
| seconds. The default value is 0, which stands for an infinite timeout. |
| Also see the <option>degr-wfc-timeout</option> parameter.</para> |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="quorum"> |
| <term xml:id="quorum"><option>quorum <replaceable>value</replaceable></option> |
| </term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>quorum</secondary> |
| </indexterm> When activated, a cluster partition requires quorum |
| in order to modify the replicated data set. That means a node in |
| the cluster partition can only be promoted to primary if the |
| cluster partition has quorum. |
| If a primary node needs to execute a write request but its |
| cluster partition has lost quorum, it will freeze IO or reject |
| the write request with an error (depending on the |
| <option>on-no-quorum</option> setting). Upon losing quorum, a primary |
| always invokes the <option>quorum-lost</option> handler. The handler is |
| intended for notification purposes; its return code is ignored.</para> |
| |
| <para>The option's value may be set to <option>off</option>, |
| <option>majority</option>, <option>all</option> or a numeric value. If you |
| set it to a numeric value, make sure that the value is greater than half |
| of your number of nodes. |
| Quorum is a mechanism to avoid data divergence; it may be used instead |
| of fencing when there are more than two replicas. It defaults to |
| <option>off</option>.</para> |
| |
| <para>If all missing nodes are marked as outdated, a partition always has |
| quorum, no matter how small it is. That is, if you disconnect all secondary |
| nodes gracefully, a single primary continues to operate. The moment a |
| single secondary is lost, it has to be assumed that it forms a partition |
| with all the missing outdated nodes. If the primary's own partition might |
| then be smaller than the other partition, quorum is lost at that moment.</para> |
| |
| <para>The quorum implementation is available starting with the DRBD kernel |
| driver version 9.0.7.</para> |
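| <para>A sketch of a quorum setup for a three-node resource, typically |
| placed in the resource's <option>options</option> section (the |
| resource name and settings are illustrative only):</para> |
| <programlisting> |
| resource r0 { |
|   options { |
|     quorum majority;       # require a majority of the nodes |
|     on-no-quorum io-error; # fail IO instead of freezing it |
|   } |
| } |
| </programlisting> |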
| |
| </definition> |
| </drbdsetup_option> |
| |
| <drbdsetup_option name="on-no-quorum"> |
| <term xml:id="on-no-quorum"><option>on-no-quorum <group choice="req" rep="norepeat"> |
| <arg choice="plain" rep="norepeat">io-error</arg> |
| <arg choice="plain" rep="norepeat">suspend-io</arg> |
| </group> |
| </option></term> |
| |
| <definition> |
| <para><indexterm significance="normal"> |
| <primary>drbd.conf</primary> |
| |
| <secondary>quorum</secondary> |
| </indexterm> By default, DRBD freezes IO on a device that has lost quorum. |
| Setting <option>on-no-quorum</option> to <option>io-error</option> causes it |
| to complete all IO operations with an error if quorum is lost.</para> |
| |
| <para>The <option>on-no-quorum</option> option is available starting with the DRBD kernel |
| driver version 9.0.8.</para> |
| </definition> |
| |
| </drbdsetup_option> |
| </drbdsetup_options> |