[pve-devel] migration problems since qemu 1.3

Stefan Priebe - Profihost AG s.priebe at profihost.ag
Fri Dec 21 14:46:26 CET 2012


This time it hangs at the first query-migrate:
------------------------------------------
Dec 21 14:44:43 starting migration of VM 100 to node 'cloud1-1203' 
(10.255.0.22)
Dec 21 14:44:43 copying disk images
Dec 21 14:44:43 starting VM 100 on remote node 'cloud1-1203'
Dec 21 14:44:46 starting migration tunnel
Dec 21 14:44:46 starting online/live migration on port 60000
Dec 21 14:44:46 migrate-set-capabilities, capabilities => [HASH(0x3933ed0)]
Dec 21 14:44:46 migrate-set-cache-size, value => 429496729
Dec 21 14:44:46 start migrate tcp:localhost:60000
Dec 21 14:44:48 query-migrate
-------------------------------------------
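As a side note, the migrate-set-cache-size value in the log is exactly one tenth of 4 GiB, rounded down. A minimal sketch of that arithmetic, assuming the VM is configured with 4096 MiB of memory; the function name is hypothetical and not taken from the Proxmox code:

```python
def cache_size_for(memory_mib: int) -> int:
    """Return one tenth of the guest memory in bytes, rounded down."""
    return memory_mib * 1024 * 1024 // 10


# 4096 MiB is an assumption based on the "4GB" mentioned in this mail.
print(cache_size_for(4096))  # prints 429496729
```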

I can reproduce this by assigning at least 4 GB of memory to a machine and
then filling the buffers and cache with:

find / -type f -print |xargs cat >/dev/null

Then start a migration.
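To tell a hung monitor apart from a slow one, the same query can be sent by hand with a timeout. A rough sketch, assuming the VM exposes a QMP UNIX socket; the socket path in the usage note and both function names are assumptions for illustration, not part of the Proxmox code:

```python
import json
import socket


def read_reply(reader):
    """Read QMP lines until a command reply arrives, skipping async events."""
    while True:
        line = reader.readline()
        if not line:
            raise ConnectionError("QMP socket closed")
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg


def query_migrate(path, timeout=5.0):
    """Send query-migrate over a QMP UNIX socket.

    Raises socket.timeout if no reply arrives within `timeout` seconds.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(path)
        reader = sock.makefile("r", encoding="utf-8")
        json.loads(reader.readline())  # QMP greeting banner
        reply = None
        # qmp_capabilities must be sent before any other command.
        for command in ("qmp_capabilities", "query-migrate"):
            sock.sendall(json.dumps({"execute": command}).encode() + b"\n")
            reply = read_reply(reader)
        return reply
```

Calling query_migrate("/var/run/qemu-server/100.qmp") (path assumed) should either return the migration status or raise socket.timeout if the monitor no longer answers.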

Stefan
On 21.12.2012 11:43, Stefan Priebe - Profihost AG wrote:
> Hi Alexandre,
>
> I've added some debugging/logging code.
>
> The output hangs at query-migrate. See here:
>
> Dec 21 11:41:59 starting migration of VM 100 to node 'cloud1-1203'
> (10.255.0.22)
> Dec 21 11:41:59 copying disk images
> Dec 21 11:41:59 starting VM 100 on remote node 'cloud1-1203'
> Dec 21 11:42:02 starting migration tunnel
> Dec 21 11:42:03 starting online/live migration on port 60000
> Dec 21 11:42:03 migrate-set-capabilities, capabilities => [HASH(0x39a9fb0)]
> Dec 21 11:42:03 migrate-set-cache-size, value => 429496729
> Dec 21 11:42:03 start migrate tcp:localhost:60000
> Dec 21 11:42:05 query-migrate
> Dec 21 11:42:05 migration status: active (transferred 468063329,
> remaining 3764068352), total 4303814656)
> Dec 21 11:42:07 query-migrate
>
> I can't even ping the VM anymore.
>
> Stefan
>
> On 21.12.2012 08:58, Alexandre DERUMIER wrote:
>> Hi Stefan, any news?
>>
>> I'm trying to reproduce your problem, but it works fine for me, no
>> crash...
>>
>> ----- Original Message -----
>>
>> From: "Stefan Priebe - Profihost AG" <s.priebe at profihost.ag>
>> To: "Alexandre DERUMIER" <aderumier at odiso.com>
>> Cc: pve-devel at pve.proxmox.com
>> Sent: Thursday, 20 December 2012 16:09:42
>> Subject: Re: [pve-devel] migration problems since qemu 1.3
>>
>> Hi,
>> On 20.12.2012 15:57, Alexandre DERUMIER wrote:
>>> Just an idea (not sure it's the problem): can you try commenting out
>>>
>>> $qmpclient->queue_cmd($vmid, $ballooncb, 'query-balloon');
>>>
>>> in QemuServer.pm, line 2081.
>>>
>>> and then restart pvedaemon and pvestatd?
>>
>> This doesn't change anything.
>>
>> Right now the kvm process is running on both the old and the new machine.
>>
>> An strace of the PID on the new machine shows a loop of:
>>
>> ----------------
>> [pid 28351] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed
>> out)
>> [pid 28351] futex(0x7ff8b8025388, FUTEX_WAKE_PRIVATE, 1) = 0
>> [pid 28351] futex(0x7ff8b8026024,
>> FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 11801, {1356016143,
>> 843092000}, ffffffff <unfinished ...>
>> [pid 28285] mremap(0x7ff77bfe4000, 160378880, 160411648, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160411648, 160448512, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160448512, 160481280, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160481280, 160514048, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160514048, 160546816, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160546816, 160583680, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160583680, 160616448, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160616448, 160649216, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160649216, 160681984, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160681984, 160718848, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160718848, 160751616, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160751616, 160784384, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160784384, 160817152, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160817152, 160854016, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160854016, 160886784, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160886784, 160919552, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160919552, 160952320, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160952320, 160989184, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 160989184, 161021952, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161021952, 161054720, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161054720, 161087488, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161087488, 161124352, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161124352, 161157120, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161157120, 161189888, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161189888, 161222656, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161222656, 161259520, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161259520, 161292288, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161292288, 161325056, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28351] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed
>> out)
>> [pid 28351] futex(0x7ff8b8025388, FUTEX_WAKE_PRIVATE, 1) = 0
>> [pid 28351] futex(0x7ff8b8026024,
>> FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 11803, {1356016144,
>> 843283000}, ffffffff <unfinished ...>
>> [pid 28285] mremap(0x7ff77bfe4000, 161325056, 161357824, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161357824, 161394688, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161394688, 161427456, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161427456, 161460224, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28345] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection
>> timed out)
>> [pid 28345] futex(0x7ff8caa2e274, FUTEX_CMP_REQUEUE_PRIVATE, 1,
>> 2147483647, 0x7ff8caa2e1b0, 872) = 1
>> [pid 28347] <... futex resumed> ) = 0
>> [pid 28345] futex(0x7ff8caa241a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
>> [pid 28347] futex(0x7ff8caa2e1b0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
>> [pid 28345] <... futex resumed> ) = 0
>> [pid 28347] <... futex resumed> ) = 0
>> [pid 28345] futex(0x7ff8caa2420c,
>> FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 799, {1356016153,
>> 954319000}, ffffffff <unfinished ...>
>> [pid 28347] sendmsg(19, {msg_name(0)=NULL, msg_iov(1)=[{"\t", 1}],
>> msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 1
>> [pid 28347] futex(0x7ff8caa2e274, FUTEX_WAIT_PRIVATE, 873, NULL
>> <unfinished ...>
>> [pid 28285] mremap(0x7ff77bfe4000, 161460224, 161492992, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161492992, 161529856, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161529856, 161562624, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161562624, 161595392, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161595392, 161628160, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161628160, 161665024, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161665024, 161697792, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161697792, 161730560, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161730560, 161763328, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161763328, 161800192, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161800192, 161832960, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161832960, 161865728, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161865728, 161898496, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> [pid 28285] mremap(0x7ff77bfe4000, 161898496, 161935360, MREMAP_MAYMOVE)
>> = 0x7ff77bfe4000
>> -----------------------
>>
>>
>> Stefan
>>

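The mremap loop in the quoted strace can be summarized with a few lines of Python. A small sketch, assuming the strace output was saved to a text file; the helper names are made up for illustration:

```python
import re

# Matches the mremap calls as they appear in the quoted strace output.
MREMAP = re.compile(r"mremap\(0x[0-9a-f]+, (\d+), (\d+), MREMAP_MAYMOVE\)")


def mremap_sizes(strace_text):
    """Return (old_size, new_size) pairs for every mremap call found."""
    return [(int(m.group(1)), int(m.group(2)))
            for m in MREMAP.finditer(strace_text)]


def summarize(pairs):
    """Summarize how the mapping grows across the given mremap calls."""
    if not pairs:
        return None
    steps = [new - old for old, new in pairs]
    return {
        "calls": len(pairs),
        "first_size": pairs[0][0],
        "last_size": pairs[-1][1],
        "min_step": min(steps),
        "max_step": max(steps),
    }
```

On the first calls quoted above, the mapping grows by 32768 or 36864 bytes per call.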

