[pve-devel] [PATCH cluster v2] add generic data broadcast interface

Thomas Lamprecht t.lamprecht at proxmox.com
Tue May 7 19:16:13 CEST 2019


On 5/3/19 12:32 PM, Dominik Csapak wrote:
> similar to how we handle the cluster wide tasklist and rrd data,
> have an interface that can sync data across the cluster
> 
> this data is only transient and will not be written to disk
> 
> we can use this for a number of things, e.g. getting the locks of the
> guests clusterwide, listing ceph services across the cluster, etc.

commit message wasn't updated respective to our discussion outcome, i.e.,
locks are still mentioned here. After a closer look a few implementation
detail specific comments inline.

> 
> Signed-off-by: Dominik Csapak <d.csapak at proxmox.com>
> ---
> sending it alone, awaiting review of the rest of my series
> changes from v1:
> * use modern sub syntax
> * better naming
> 
>  data/PVE/Cluster.pm | 47 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
> 
> diff --git a/data/PVE/Cluster.pm b/data/PVE/Cluster.pm
> index 5af11e6..a50785d 100644
> --- a/data/PVE/Cluster.pm
> +++ b/data/PVE/Cluster.pm
> @@ -540,6 +540,53 @@ sub get_nodelist {
>      return [ keys %$nodelist ];
>  }
>  
> +# best effort data store for cluster
> +# this data is gone if the pmxcfs is restarted, but only the local data,
> +# so we should not use this for very important data
> +sub broadcast_node_kv {
> +    my ($key, $data) = @_;
> +
> +    my $size = length(encode_json($data));
> +    if ($size >= (32 * 1024)) {

why this hardcoded size limit, just "semi-random" upper one or?

Do not like that we encode it here and then again in $ipcc_update_status ..
Could add an optional size $param there (or have a always-on there?)?

> +	warn "data for '$key' too big\n";
> +	return;
> +    }
> +
> +    eval {
> +	$ipcc_update_status->("kv/$key", $data);
> +    };
> +
> +    warn $@ if $@;
> +}
> +
> +sub get_node_kv {
> +    my ($key, $nodename) = @_;
> +
> +    my $res = {};
> +    my $get_node_data = sub {
> +	my ($node) = @_;
> +	eval {
> +	    my $raw = $ipcc_get_status->("kv/$key", $node);
> +	    my $data = decode_json($raw) if $raw;

post if is dangerous, as noted a few times on this list already, e.g.:
https://pve.proxmox.com/pipermail/pve-devel/2019-April/036718.html

if the prev loop call to this resulted some valid $data, and this one fails,
we'll record the old $data to the current $node's $res entry...

just a:
$res->{$node} = decode_json($raw) if $raw;

saves a line and would not have this problematic..

> +	    $res->{$node} = $data;
> +	};
> +	my $err = $@;
> +	syslog('err', $err) if $err;

hmm, not 100% sure about this error handling... why not just warn or
plainly die? Decode can only fail if something did not get broadcast
through broadcast_node_kv or pmxcfs got an issue, and the ipcc call
fails if IPCC is broken or it cannot be reconnected after more than
10 seconds, i.e., pmxcfs is probably dead. For both you may want to
notice your caller about the fact that the data is corrupted/invalid?
There we can still catch this then, and make a distinction between failed
and no data? 

> +    };
> +
> +    if ($nodename) {
> +	$sub->($nodename);
> +    } else {
> +	my $nodelist = get_nodelist();
> +
> +	foreach my $node (@$nodelist) {
> +	    $get_node_data->($node);
> +	}
> +    }
> +
> +    return $res;
> +}
> +
>  # $data must be a chronological descending ordered array of tasks
>  sub broadcast_tasklist {
>      my ($data) = @_;
> 





More information about the pve-devel mailing list