[pve-devel] [PATCH v2 cluster 1/5] corosync: add verify_conf

Fabian Grünbichler f.gruenbichler at proxmox.com
Mon Dec 16 18:09:45 CET 2019


On December 16, 2019 3:58 pm, Stefan Reiter wrote:
> On 12/12/19 4:21 PM, Thomas Lamprecht wrote:
>> On 12/4/19 3:04 PM, Stefan Reiter wrote:
>>> It does some basic sanity checking, warns the user about encryption
>>> settings and unresolved hostnames, and finally makes sure that all nodes
>>> have the same links configured (as well as comparing the configured
>>> links to specified interfaces, if there are any).
>>>
>>> A corosync.conf that has been created and modified strictly through our
>>> API should *always* be valid.
>> 
>> True, but you also need to stay backward compatible with what our API
>> produced since 4.0 and what was allowed to be upgraded to 6.0.
>> Did you check if the possibilities then, with the pve-upgrade checker
>> were restrictive enough as you're here?
>> 
> 
> That's the reason for the
> 
> if (%$interfaces) {
>      ...
> } else {
>      ...
> }
> 
> block. New versions always have a totem->interface block, but old 
> versions omitted it. Old configs are checked differently, but will still 
> be valid (as long as the links are consistent between nodes).
> 
>> Also, I'd still like to avoid error-ing if the configuration could worked,
>> but was a bit garbled up.. This is then still seen as regression for a
>> lot, and pitch forks may be lit and raised, so I'd like to try hard to
>> not break those setups, even if we can argue that it's not directly
>> from our API. If it's at least somewhat reasonable it should not be
>> an error. (did not checked all checks to closely, so you may already be
>> flexible enough, so this is more asking if you checked :) )
>> 
> 
> verify_conf only "errors" under the following conditions:
> 
> * no nodes found (corosync won't run, obviously)
> * no totem found (corosync won't run, and currently our tools don't 
> really handle this well either)
> * links are mismatched between nodes or nodes and totem
> 
> The last one is the case that initially inspired the function. 
> Technically, corosync does run with mismatched links (e.g. [0]), but 
> throws some nasty looking errors into syslog. However, the code in patch 
> 3/5 (verification and fallback behaviour) falls apart a bit with such 
> configs.
> 
> The idea is to tell the user that his config is wrong, instead of 
> automagically doing the wrong thing™. We only call this on cluster join, 
> so existing clusters are unaffected anyway (although I though about 
> maybe calling it in 'pvecm status', but we wouldn't need to die there, 
> just show the user).
> 
> [0]: "mismatched" corosync.conf (shortened):
> nodelist {
>    node {
>      name: prod1
>      nodeid: 1
>      ring0_addr: 192.168.0.1
>    }
>    node {
>      name: prod2
>      nodeid: 2
>      ring0_addr: 192.168.0.2
>      ring1_addr: 192.168.1.2
>    }
> }
> 
> totem {
>    interface {
>      linknumber: 0
>      linknumber: 4
>    }
>    ...
> }
> 
> TL;DR: Yes, I checked ;) But above should give you the full picture, if 
> there are any concerns/things I missed.
> 
>>> +    push @warnings, "warning: authentication/encryption is not explicitly enabled"
>>> +	. " (secauth / crypto_cipher / crypto_hash)"
>> 
>> you do not check the crypto_hash?
>> 
> 
> Hm, true, I just reused the code from pve5to6.pm:408. But I should have 
> noticed that nonetheless...
> 
> I'll fix it here and in the 5to6 tool, or was there a reason to do it 
> this way there?

IIRC, crypto_cipher depends on crypto_hash being set, and we want 
encryption and not just authentication ;)

> 
>>> +	if (!defined($totem->{secauth}) || $totem->{secauth} ne 'on') &&
>>> +	   (!defined($totem->{crypto_cipher}) || $totem->{crypto_cipher} eq 'none');
>> 
>> This style above is NAK'd ^^
>> Post-if only if the statement it guards is on a single, not long, line.
>> 
>> Either full-blown `if () { .. }` or split it up and do:
>> 
>> push @warnings, "warning: authentication (secauth) is not explicitly enabled"
>>      if (!defined($totem->{secauth}) || $totem->{secauth} ne 'on');
>> 
>> push @warnings, "warning: encryption (crypto_cipher) is not explicitly enabled"
>>      if (!defined($totem->{secauth}) || $totem->{secauth} ne 'on');
>> 
> 
> Ok
> 
>> 
>>> +
>>> +    my $interfaces = $totem->{interface};
>>> +
>>> +    my $verify_link_ip = sub {
>>> +	my ($key, $link, $node) = @_;
>>> +	my ($resolved_ip, undef) = resolve_hostname_like_corosync($link, $conf);
>>> +	if (defined($resolved_ip)) {
>>> +	    push @warnings, "warning: $key '$link' for node '$node' resolves to"
>>> +		. " '$resolved_ip' - consider replacing it with the currently"
>>> +		. " resolved IP address for stability"
>>> +		if $resolved_ip ne $link;
>> 
>> same here, better to just use full-blown if (...) {…}
>> 
>>> +	} else {
>>> +	    push @warnings, "warning: unable to resolve $key '$link' for node '$node'"
>>> +	      . " to an IP address according to Corosync's resolve strategy -"
>>> +	      . " cluster could fail on restart!";
>>> +	}
>>> +    };
>>> +
>>> +    # sort for output order stability
>>> +    my @node_names = sort keys %$nodelist;
>>> +
>>> +    my $node_links = {};
>>> +    foreach my $node (@node_names) {
>>> +	my $options = $nodelist->{$node};
>>> +	foreach my $opt (keys %$options) {
>>> +	    next if $opt !~ $link_addr_re;
>> 
>> maybe a ("private") helper method which return both, type and id, would be nicer,
>> else one asks what $1, or $2 is:
>> 
>> my ($linktype, $linkid) = $parse_link_entry->($opt) or next;
>> 
> 
> I think I use something similar in a later patch, so might even make 
> sense to do it as a public helper. Need to check, but I'll change this 
> part either way.
> 
>> (that with the next is guessed, needs to ensure it has no bad side-effects
>> in combination with the my ())
>> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 




More information about the pve-devel mailing list