[pve-devel] firewall : possible bug/race when cluster.fw is replicated and rules are updated ?

Alexandre DERUMIER aderumier at odiso.com
Wed Jan 9 08:36:51 CET 2019


>>But that is true for file systems in general? Even if you're on a local
>>"standard" filesystem and have a reader/writer process pair you need to ensure
>>some level of atomicity regulation, either a (shared) rw_lock or like in our
>>case, where the reader reopens the file on every read loop iteration anyway a
>>"write to tmp then rename" is enough, as with this the reader never opens the
>>file while write operations are still in flight. So nothing specific to pmxcfs.

Thanks Thomas. I really didn't known about this. (Or didn't remember)

I found a small vim script to do atomic write (.tmp + mv)
https://github.com/vim-scripts/Atomic-Save


>>Hmm, but if one wants to restore the defaults by simply deleting the file he'd 
>>need to restart the firewall daemon too. Not really sure if this is ideal 
>>either... Even if we could do heuristics for if the file was really 
>>removed/truncated (double checks) that would be just feel hacky and as said 
>>above, such actions can get you in trouble with all processes where there are 
>>reader writers, so this should be handled by the one updating the file. 

Ok I understand.
I'm also think of case, where we could have a corosync/network failure, 
where /etc/pve couldn't be mounted anymore or not readable, 
that mean that in this case the firewall will be off too.
That's seem bad for security....



>>But maybe a note like "As with all other filesystems you need to ensure a write 
>>operation is seen atomic by any read process by writing to a temporary file and 
>>then renaming (move) it. 

Sound great :)


----- Mail original -----
De: "Thomas Lamprecht" <t.lamprecht at proxmox.com>
À: "pve-devel" <pve-devel at pve.proxmox.com>, "aderumier" <aderumier at odiso.com>, "Stefan Priebe, Profihost AG" <s.priebe at profihost.ag>
Envoyé: Mercredi 9 Janvier 2019 08:16:46
Objet: Re: [pve-devel] firewall : possible bug/race when cluster.fw is replicated and rules are updated ?

On 1/8/19 10:19 PM, Alexandre DERUMIER wrote: 
>>> or those cases i use something like (pseudocode - i use salt not puppet): 
>>> 
>>> - manage copy of file 
>>> - if file has changed trigger: 
>>> - mv -v $managedfile $realfile 
>>> 
> 
>>> Greets, 
>>> Stefan 
> 
> Thanks Stefan, works fine indeed. 
> 
> I really didn't known/remember that /etc/pve was not atomic without a move. 
> Maybe should we add a warning in documentation about this. 

But that is true for file systems in general? Even if you're on a local 
"standard" filesystem and have a reader/writer process pair you need to ensure 
some level of atomicity regulation, either a (shared) rw_lock or like in our 
case, where the reader reopens the file on every read loop iteration anyway a 
"write to tmp then rename" is enough, as with this the reader never opens the 
file while write operations are still in flight. So nothing specific to pmxcfs. 

> 
> also, maybe for firewall, could we add a protection for cluster.fw ( 

Hmm, but if one wants to restore the defaults by simply deleting the file he'd 
need to restart the firewall daemon too. Not really sure if this is ideal 
either... Even if we could do heuristics for if the file was really 
removed/truncated (double checks) that would be just feel hacky and as said 
above, such actions can get you in trouble with all processes where there are 
reader writers, so this should be handled by the one updating the file. 

But maybe a note like "As with all other filesystems you need to ensure a write 
operation is seen atomic by any read process by writing to a temporary file and 
then renaming (move) it. 


> 
> sub update { 
> my $code = sub { 
> 
> my $cluster_conf = load_clusterfw_conf(); 
> + return if !$cluster_conf 
> 
> my $cluster_options = $cluster_conf->{options}; 
> 
> if (!$cluster_options->{enable}) { 
> PVE::Firewall::remove_pvefw_chains(); 
> return; 
> } 
> 
> 
> ----- Mail original ----- 
> De: "Stefan Priebe, Profihost AG" <s.priebe at profihost.ag> 
> À: "pve-devel" <pve-devel at pve.proxmox.com>, "aderumier" <aderumier at odiso.com>, "Thomas Lamprecht" <thomas at lamprecht.org> 
> Envoyé: Mardi 8 Janvier 2019 21:59:44 
> Objet: Re: [pve-devel] firewall : possible bug/race when cluster.fw is replicated and rules are updated ? 
> 
> Hi Alexandre, 
> 
> 
> Am 08.01.19 um 21:55 schrieb Alexandre DERUMIER: 
>>>> But, file_set_contents - which save_clusterfw_conf uses - does this already[0], 
>>>> so maybe this is the "high-level fuse rename isn't atomic" bug again... 
>>>> May need to take a closer look tomorrow. 
>> 
>> mmm, ok. 
>> 
>> In my case, it was with a simple file copy (cp /tmp/cluster.fw /etc/pve/firewall/cluster.fw). (I manage cluster.fw through puppet for multiple cluster). 
>> can reproduce it too with a simple vi edition. 
>> 
>> I think others users could trigger this too, if they use scripts to automate ipset (blacklist, ....). 
>> 
>> Maybe could it be better to only disable firewall when option is setup with "enabled:0", and if cluster.fw is missing, don't change any rules. 
>> ²²² 
> 
> 
> for those cases i use something like (pseudocode - i use salt not puppet): 
> 
> - manage copy of file 
> - if file has changed trigger: 
> - mv -v $managedfile $realfile 
> 
> 
> Greets, 
> Stefan 
> 
>> 
>> 
>> 
>> ----- Mail original ----- 
>> De: "Thomas Lamprecht" <thomas at lamprecht.org> 
>> À: "pve-devel" <pve-devel at pve.proxmox.com>, "aderumier" <aderumier at odiso.com> 
>> Envoyé: Mardi 8 Janvier 2019 20:58:51 
>> Objet: Re: [pve-devel] firewall : possible bug/race when cluster.fw is replicated and rules are updated ? 
>> 
>> Hi, 
>> 
>> On 1/8/19 7:37 PM, Alexandre DERUMIER wrote: 
>>> I'm able to reproduce with: 
>>> --------------------------- 
>>> on 1 host: 
>>> 
>>> cluster.fw: 
>>> [OPTIONS] 
>>> 
>>> enable: 1 
>>> policy_in: ACCEPT 
>>> 
>>> 
>>> 
>>> 
>>> #!/usr/bin/perl 
>>> 
>>> use IO::File; 
>>> use PVE::Firewall; 
>>> use Data::Dumper; 
>>> use Time::HiRes qw ( time alarm sleep usleep ); 
>>> 
>>> while(1){ 
>>> 
>>> $filename = "/etc/pve/firewall/cluster.fw"; 
>>> 
>>> if (my $fh = IO::File->new($filename, O_RDONLY)) { 
>>> 
>>> $cluster_conf = PVE::Firewall::parse_clusterfw_config($filename, $fh, $verbose); 
>>> my $cluster_options = $cluster_conf->{options}; 
>>> 
>>> if (!$cluster_options->{enable}) { 
>>> print Dumper($cluster_options); 
>>> die "error\n"; 
>>> } 
>>> 
>>> } 
>>> usleep(100); 
>>> }; 
>>> 
>>> 
>>> the script is running fine. 
>>> 
>>> 
>>> on another host, edit the file (simple open/write), 
>>> then the script on first host, return 
>>> 
>>> $VAR1 = {}; 
>>> error 
>> 
>> that is expected, AFAICT, a modify operation shouldn't be: 
>> * read FILE -> modify -> write FILE 
>> but rather: 
>> * read FILE -> modify -> write FILE.TMP -> move FILE.TMP to FILE 
>> if it's wanted that always a valid content is read. Else yes, you may have a small 
>> time window where the file is truncated. 
>> 
>> But, file_set_contents - which save_clusterfw_conf uses - does this already[0], 
>> so maybe this is the "high-level fuse rename isn't atomic" bug again... 
>> May need to take a closer look tomorrow. 
>> 
>> [0]: https://git.proxmox.com/?p=pve-common.git;a=blob;f=src/PVE/Tools.pm;h=accf6539da94d2b5d5b6f4539310fe5c4d526c7e;hb=HEAD#l213 
>> 
>>> 
>>> ----- Mail original ----- 
>>> De: "aderumier" <aderumier at odiso.com> 
>>> À: "pve-devel" <pve-devel at pve.proxmox.com> 
>>> Envoyé: Mardi 8 Janvier 2019 19:15:06 
>>> Objet: [pve-devel] firewall : possible bug/race when cluster.fw is replicated and rules are updated ? 
>>> 
>>> Hi, 
>>> I'm currently debugging a possible firewalling problem. 
>>> I'm running some cephfs client in vm, firewalled by proxmox. 
>>> cephfs client are really sensitive to network problem, and mainly with packets logss or dropped packets. 
>>> 
>>> I'm really not sure, but I have currently puppet updating my cluster.fw, at regular interval, 
>>> and sometimes, I have all the vm on a specific host (or multiple hosts), at the same time, have a small disconnect (maybe some second). 
>>> 
>>> 
>>> I would like to known, if cluster.fw replication is atomic in /etc/pve/ ? 
>>> or if they are any chance, that during file replication, the firewall try to read the file, 
>>> it could be empty ? 
>>> 
>>> 
>>> I just wonder (I'm really really not sure) if I could trigger this: 
>>> 
>>> 
>>> sub update { 
>>> my $code = sub { 
>>> 
>>> my $cluster_conf = load_clusterfw_conf(); 
>>> my $cluster_options = $cluster_conf->{options}; 
>>> 
>>> if (!$cluster_options->{enable}) { 
>>> PVE::Firewall::remove_pvefw_chains(); 
>>> return; 
>>> } 
>>> 
>>> 
>>> cluster.conf not readable/absent/.... , and remove_pvefw_chains called. 
>>> then after some seconds, rules are applied again. 
>>> 
>>> 
>>> I'm going to add some log to try to reproduce it. (BTW, it could be great to logs rules changed, maybe an audit log with a diff could be great) 
>>> _______________________________________________ 
>>> pve-devel mailing list 
>>> pve-devel at pve.proxmox.com 
>>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 
>>> 
>>> _______________________________________________ 
>>> pve-devel mailing list 
>>> pve-devel at pve.proxmox.com 
>>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 
>> 
>> _______________________________________________ 
>> pve-devel mailing list 
>> pve-devel at pve.proxmox.com 
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 
>> 
> 
> _______________________________________________ 
> pve-devel mailing list 
> pve-devel at pve.proxmox.com 
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 
> 




More information about the pve-devel mailing list