IPsrcaddr: resource stop problem #2019

Rico29 · 2025-01-27T12:02:01Z

Hello,
I have an issue with the IPsrcaddr resource on the latest Debian 12 with up-to-date packages.

I've pulled findif.sh and IPsrcaddr from the ClusterLabs/resource-agents GitHub repo.

The problem occurs when moving a resource (or resource group) to another node.

Reproduction :

node 1 :

root@freepbx-lab-ha1:~# crm status
[...]
Node List:
  * Online: [ freepbx-lab-ha1 freepbx-lab-ha2 ]

Full List of Resources:
  * email_alert (ocf:heartbeat:MailTo):  Started freepbx-lab-ha2
  * Resource Group: grp_services:
    * shared_ip (ocf:heartbeat:IPaddr2):         Started freepbx-lab-ha1
    * src_ip    (ocf:heartbeat:IPsrcaddr):       Started freepbx-lab-ha1
    * srv_freepbx       (systemd:freepbx):       Started freepbx-lab-ha1

root@freepbx-lab-ha1:~# ip a
[...]
3: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether bc:24:11:6b:df:9e brd ff:ff:ff:ff:ff:ff
    inet 192.168.222.211/24 brd 192.168.222.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet 192.168.222.210/32 brd 192.168.222.255 scope global bond0
       valid_lft forever preferred_lft forever

root@freepbx-lab-ha1:~# ip r
default via 192.168.222.1 dev bond0 proto keepalived src 192.168.222.210 onlink 
192.168.222.0/24 dev bond0 proto keepalived scope link src 192.168.222.210 
192.168.222.212 dev bond0 scope link src 192.168.222.211

node 2 :

root@freepbx-lab-ha2:~# crm status
[...]
Node List:
  * Online: [ freepbx-lab-ha1 freepbx-lab-ha2 ]

Full List of Resources:
  * email_alert (ocf:heartbeat:MailTo):  Started freepbx-lab-ha2
  * Resource Group: grp_services:
    * shared_ip (ocf:heartbeat:IPaddr2):         Started freepbx-lab-ha1
    * src_ip    (ocf:heartbeat:IPsrcaddr):       Started freepbx-lab-ha1
    * srv_freepbx       (systemd:freepbx):       Started freepbx-lab-ha1

root@freepbx-lab-ha2:~# ip a
[...]
3: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether bc:24:11:bf:7d:11 brd ff:ff:ff:ff:ff:ff
    inet 192.168.222.212/24 brd 192.168.222.255 scope global bond0
       valid_lft forever preferred_lft forever

root@freepbx-lab-ha2:~# ip r
default via 192.168.222.1 dev bond0 proto keepalived src 192.168.222.212 onlink 
192.168.222.0/24 dev bond0 proto kernel scope link src 192.168.222.212 
192.168.222.211 dev bond0 scope link src 192.168.222.212

At startup, everything works correctly: node 1 owns the IPaddr2 address, and the default route uses this address as its source address.

When I move the resource group with crm resource move grp_services freepbx-lab-ha2, I get this status and error:

Node List:
  * Online: [ freepbx-lab-ha1 freepbx-lab-ha2 ]

Full List of Resources:
  * email_alert (ocf:heartbeat:MailTo):  Started freepbx-lab-ha2
  * Resource Group: grp_services:
    * shared_ip (ocf:heartbeat:IPaddr2):         Started freepbx-lab-ha1
    * src_ip    (ocf:heartbeat:IPsrcaddr):       FAILED freepbx-lab-ha1 (blocked)
    * srv_freepbx       (systemd:freepbx):       Stopped

Failed Resource Actions:
  * src_ip stop on freepbx-lab-ha1 returned 'error' (command 'ip route replace  192.168.222.0/24 dev bond0 proto kernel scope link src 192.168.222.211) at Mon Jan 27 12:59:40 2025 after 48ms

Running the "ip route..." command manually on that node returns no error:

# ip route replace  192.168.222.0/24 dev bond0 proto kernel scope link src 192.168.222.211 && echo $?
0

How can I fix this?
Regards


oalbrigt · 2025-01-27T14:24:40Z

If you run pcs resource update src_ip trace_ra=1 or the crm equivalent you will get trace-files for every run of each action in /var/lib/heartbeat.

Then try to move it again, and you should be able to identify exactly which command fails.
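As a rough sketch of how to pick out the trace of the most recent failing action: the helper below is hypothetical (not part of resource-agents) and assumes trace files are named `<resource>.<action>.<timestamp>` inside a per-agent directory such as /var/lib/heartbeat/trace_ra/IPsrcaddr. Both the file naming and the directory layout are assumptions here, so check them against your installation. With crmsh, `crm resource trace src_ip` is assumed to be the equivalent of the pcs command above.

```shell
# Hypothetical helper: print the newest trace file for one resource/action.
#   $1 = trace directory (assumed layout, e.g. /var/lib/heartbeat/trace_ra/IPsrcaddr)
#   $2 = resource id (e.g. src_ip)
#   $3 = action (e.g. stop)
latest_trace() {
    # ls -t sorts by modification time, newest first; errors from an
    # unmatched glob are discarded so the function prints nothing.
    ls -t "$1"/"$2"."$3".* 2>/dev/null | head -n 1
}
```

For example, `less "$(latest_trace /var/lib/heartbeat/trace_ra/IPsrcaddr src_ip stop)"` would open the latest stop trace, if the assumed layout holds.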

Rico29 · 2025-02-26T13:45:16Z

Hello
Sorry for the late response, I was working on other things, but I've found the problem.

The IPaddr2 resource adds a virtual IP address to the interface, but this address is not marked as "secondary". So the PRIMARY_IP variable contains two IP addresses: the original one (defined in the host network config) and the one added by the IPaddr2 resource:

# ip -4 -o addr show dev bond0.324 primary | awk '{split($4,a,"/");print a[1]}'
172.24.0.2
172.24.0.1
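A minimal sketch of a guard against this situation, assuming the goal is to notice early when the lookup returns more than one address. The function name `single_primary` is hypothetical and is not part of IPsrcaddr; it only checks the line count of whatever is piped into it.

```shell
# Hypothetical guard: read addresses (one per line) on stdin and succeed
# only if exactly one non-empty line was given.
single_primary() {
    addrs=$(cat)
    # grep -c . counts non-empty lines; it prints 0 when there are none.
    count=$(printf '%s\n' "$addrs" | grep -c .)
    if [ "$count" -ne 1 ]; then
        echo "expected 1 primary address, got $count" >&2
        return 1
    fi
    printf '%s\n' "$addrs"
}
```

It could be appended to the pipeline above, e.g. `ip -4 -o addr show dev bond0.324 primary | awk '{split($4,a,"/");print a[1]}' | single_primary`, which would fail with a message for the two-address output shown here.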

So the fix you proposed in #1450 is working fine.

But is that the right solution? Shouldn't IPaddr2 set up the virtual IP address as "secondary"?

when adding the IP address as secondary, I get this warning :

# ip addr add 172.24.0.1/16 dev bond0.324 secondary
Warning: secondary option is not mutable from userspace

But the IP address is correctly added as "secondary"

# ip -4 -o addr show dev bond0.324
6: bond0.324    inet 172.24.0.2/16 brd 172.24.255.255 scope global bond0.324\       valid_lft forever preferred_lft forever
6: bond0.324    inet 172.24.0.1/16 scope global secondary bond0.324\       valid_lft forever preferred_lft forever

and the awk in IPsrcaddr works as expected and only returns the primary ip address.
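To illustrate what the flag changes, here is a small sketch that applies a text filter to `ip -4 -o addr show` output and drops lines carrying the `secondary` flag. The function name `primary_only` is hypothetical; the agent itself relies on the `primary` selector of `ip`, as in the command quoted earlier, so this is only an equivalent view on the sample output, not the agent's code.

```shell
# Hypothetical filter: read `ip -4 -o addr show` output on stdin and print
# the address (without prefix length) of every line not flagged "secondary".
primary_only() {
    awk '!/ secondary / { split($4, a, "/"); print a[1] }'
}
```

Fed with the two lines shown above, it prints only 172.24.0.2.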

So, is this IPaddr2's responsibility or IPsrcaddr's?
Regards

oalbrigt · 2025-02-27T08:58:41Z

That should be in the IPaddr2 agent.

I'll look into adding the logic to add the IP as secondary.
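For background, a sketch of the rule that seems to decide the flag. This is an assumption about kernel behavior, not something stated in the thread: an address appears to be marked secondary when an existing address on the interface has the same prefix length and lies in the same subnet. That would be consistent with the outputs above, where 192.168.222.210/32 next to 192.168.222.211/24 is not flagged, while 172.24.0.1/16 next to 172.24.0.2/16 is. The helper names below are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Assumed rule: succeed if the new address ($2, addr/prefix) has the same
# prefix length and network as the existing address ($1, addr/prefix).
would_be_secondary() {
    [ "${1#*/}" = "${2#*/}" ] || return 1
    p=${1#*/}
    mask=$(( (0xFFFFFFFF << (32 - p)) & 0xFFFFFFFF ))
    a=$(ip_to_int "${1%/*}")
    b=$(ip_to_int "${2%/*}")
    [ $(( a & mask )) -eq $(( b & mask )) ]
}
```

For instance, `would_be_secondary 172.24.0.2/16 172.24.0.1/16` succeeds, while `would_be_secondary 192.168.222.211/24 192.168.222.210/32` does not.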
