Tuesday, 22 November 2016

Rune Audio, RPi and networking recovery

It seems that if the wireless/wired router is rebooted, then all the Rune Audio instances have to be restarted to link up with the wired interface (and possibly the wireless). If this is true, this is very boring. It certainly seems to be true - I have to reboot the router sometimes because it gets upset and won't connect my phone (1st world problems eh??).

So I'm looking at the networking setup in Rune - here's the Rune debug first few lines:

###### System info ######
Linux version 4.1.20-1-ARCH (builduser@leming) (gcc version 5.3.0 (GCC) ) #1 SMP Sat Mar 19 20:07:23 MDT 2016

system time: Tue Nov 22 15:54:10 GMT 2016

system load: 0.19 0.10 0.06 1/181 3960

HW platform: RaspberryPi2 (08)

playerID: 08b26fff2ac4ddac6fee7fbbc1890504ae
I'm using the Office system, because that's the one I have configured with
  • A wired connection through an Airport Express, so it's actually wireless but looks like a wired connection to the RPi
  • A wireless connection to the Netgear router
You can see these here:

[root@office ~]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.12  netmask 255.255.255.0  broadcast 192.168.1.255
        ether b8:27:eb:d3:a9:ca  txqueuelen 1000  (Ethernet)
        RX packets 747196  bytes 156859854 (149.5 MiB)
        RX errors 0  dropped 8  overruns 0  frame 0
        TX packets 573333  bytes 203126147 (193.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.14  netmask 255.255.255.0  broadcast 192.168.1.255
        ether b8:27:eb:86:fc:9f  txqueuelen 1000  (Ethernet)
        RX packets 31182  bytes 3441166 (3.2 MiB)
        RX errors 0  dropped 3566  overruns 0  frame 0
        TX packets 2181  bytes 412069 (402.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

I can't see them properly in the router - it appears to be a bit confused (maybe it's the fricking router that's the problem...) - here's the devices connected

Wired Devices
#IP AddressDevice Name MAC Address
1192.168.1.3serverB8:27:EB:15:6E:B7
2192.168.1.4FrontroomB8:27:EB:25:C4:1A
3192.168.1.11FRONTROOMB8:27:EB:25:C4:1A
 Wireless Devices (Wireless intruders also show up here)
#IP AddressDevice Name MAC Address
1192.168.1.12officeB8:27:EB:D3:A9:CA
2192.168.1.2johns-airport-express00:1F:F3:07:9F:EB
3----B8:27:EB:86:FC:9F
4192.168.1.13meye-55ace5a800:0F:60:06:7E:10
5----B8:27:EB:70:91:4F
6192.168.1.10meye-e5167b3374:DA:38:3A:33:7C
7192.168.1.7Macintosh-GoukD8:A2:5E:97:E7:E6
8192.168.1.6Jenny-PC1C:65:9D:48:E0:77
9----F0:DB:F8:69:8E:97

Obviously, 192.168.1.12 is the wired connection (eth0), it shows up here because it's via the Airport Express (192.168.1.2). The wireless connection (wlan0) is actually at 192.168.1.14, on B8:27:EB:86:FC:9F - I know because I can telnet into it!! Weird. The router is shyte, really!!

Anyway, I have been testing the unplug/replug of the wired connection, and the unplug/replug of the Airport Express, thereby simulating the two different situations with the usual router. It's hard to do this for real because if I unplug the router I lose connectivity to the RPi. I guess I could look at the logs after the fact, if they survive a reboot, I think they do.

Commands Used

Rune uses ArchLinux. The commands I'm using to investigate the networking activity are:
  • journalctl - enables listing of the system journal. Use -f to tail the journal
  • message <your message> - sends a message to the system journal, very handy for documenting equipment state changes in the log alongside the system's reaction to them

Results

Interesting! The first time I did it, the eth0 interface didn't seem to reconnect (highlighted time a result of search hit on "15:40"!); irrelevant entries deleted:

Cable unplugged
Nov 22 15:40:08 office dhcpcd[1057]: eth0: carrier lost
Nov 22 15:40:08 office kernel: smsc95xx 1-1.1:1.0 eth0: link down
Nov 22 15:40:09 office dhcpcd[1057]: eth0: deleting route to 192.168.1.0/24
Nov 22 15:40:09 office dhcpcd[1057]: eth0: deleting default route via 192.168.1.1
Nov 22 15:40:09 office ifplugd[224]: Link beat lost.
Nov 22 15:40:15 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 down'.
Nov 22 15:40:15 office ifplugd[224]: client: Stopping network profile 'eth0'...
Nov 22 15:40:15 office dhcpcd[474]: sending signal TERM to pid 1057
Nov 22 15:40:15 office dhcpcd[474]: waiting for pid 1057 to exit

Nov 22 15:40:15 office dhcpcd[1057]: received SIGTERM, stopping
Nov 22 15:40:15 office dhcpcd[1057]: eth0: removing interface
Nov 22 15:40:15 office dhcpcd[1057]: dhcpcd exited
Nov 22 15:40:15 office ifplugd[224]: client: Stopped network profile 'eth0'
Nov 22 15:40:15 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 15:40:15 office ifplugd[224]: Program executed successfully.
Nov 22 15:40:15 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Cable replugged
Nov 22 15:40:25 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Nov 22 15:40:25 office ifplugd[224]: Link beat detected.
Nov 22 15:40:26 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 up'.
Nov 22 15:40:26 office ifplugd[224]: client: Starting network profile 'eth0'...
Nov 22 15:40:26 office dhcpcd[511]: DUID 00:01:00:01:1c:72:2b:ea:06:92:c5:a0:0d:d6
Nov 22 15:40:26 office dhcpcd[511]: eth0: IAID eb:d3:a9:ca
Nov 22 15:40:26 office dhcpcd[511]: eth0: rebinding lease of 192.168.1.12
Nov 22 15:40:31 office dhcpcd[511]: eth0: NAK: from 192.168.1.1
Nov 22 15:40:31 office dhcpcd[511]: eth0: soliciting a DHCP lease
Nov 22 15:40:32 office dhcpcd[511]: eth0: offered 192.168.1.12 from 192.168.1.1
Nov 22 15:40:32 office dhcpcd[511]: eth0: probing address 192.168.1.12/24
Nov 22 15:40:37 office dhcpcd[511]: eth0: leased 192.168.1.12 for 86400 seconds
Nov 22 15:40:37 office dhcpcd[511]: eth0: adding route to 192.168.1.0/24
Nov 22 15:40:37 office dhcpcd[511]: eth0: adding default route via 192.168.1.1
Nov 22 15:40:37 office dhcpcd[511]: forked to background, child pid 582

Nov 22 15:40:37 office ifplugd[224]: client: Started network profile 'eth0'
Nov 22 15:40:37 office ifplugd[224]: Program executed successfully.
Router unplugged

Nov 22 15:41:03 office dhcpcd[582]: eth0: carrier lost
Nov 22 15:41:03 office kernel: smsc95xx 1-1.1:1.0 eth0: link down
Nov 22 15:41:03 office ifplugd[224]: Link beat lost.
Nov 22 15:41:03 office dhcpcd[582]: eth0: deleting route to 192.168.1.0/24
Nov 22 15:41:03 office dhcpcd[582]: eth0: deleting default route via 192.168.1.1
Nov 22 15:41:09 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 down'.
Nov 22 15:41:09 office ifplugd[224]: client: Stopping network profile 'eth0'...
Nov 22 15:41:09 office dhcpcd[653]: sending signal TERM to pid 582
Nov 22 15:41:09 office dhcpcd[653]: waiting for pid 582 to exit
Nov 22 15:41:09 office dhcpcd[582]: received SIGTERM, stopping
Nov 22 15:41:09 office dhcpcd[582]: eth0: removing interface
Nov 22 15:41:09 office dhcpcd[582]: dhcpcd exited
Nov 22 15:41:09 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 15:41:09 office ifplugd[224]: client: Stopped network profile 'eth0'

Nov 22 15:41:10 office ifplugd[224]: Program executed successfully.
Nov 22 15:41:10 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 15:42:03 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 10Mbps, half-duplex, lpa 0x0020
Router replugged
Nov 22 15:42:04 office ifplugd[224]: Link beat detected.
Nov 22 15:42:05 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 up'.
Nov 22 15:42:05 office ifplugd[224]: client: Starting network profile 'eth0'...
Nov 22 15:42:05 office dhcpcd[707]: DUID 00:01:00:01:1c:72:2b:ea:06:92:c5:a0:0d:d6
Nov 22 15:42:05 office dhcpcd[707]: eth0: IAID eb:d3:a9:ca
Nov 22 15:42:05 office dhcpcd[707]: eth0: rebinding lease of 192.168.1.12
Nov 22 15:42:10 office dhcpcd[707]: eth0: DHCP lease expired
Nov 22 15:42:10 office dhcpcd[707]: eth0: soliciting a DHCP lease
Nov 22 15:42:18 office dhcpcd[707]: eth0: carrier lost
Nov 22 15:42:18 office kernel: smsc95xx 1-1.1:1.0 eth0: link down
Nov 22 15:42:20 office dhcpcd[707]: eth0: carrier acquired
Nov 22 15:42:20 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Nov 22 15:42:21 office dhcpcd[707]: eth0: IAID eb:d3:a9:ca
Nov 22 15:42:21 office dhcpcd[707]: eth0: soliciting a DHCP lease
Nov 22 15:42:35 office dhcpcd[707]: timed out
Nov 22 15:42:35 office dhcpcd[707]: dhcpcd exited
Nov 22 15:42:35 office ifplugd[224]: client: <3>DHCP IPv4 lease attempt failed on interface 'eth0'
Nov 22 15:42:35 office ifplugd[224]: client: <3>Failed to bring the network up for profile 'eth0'
Nov 22 15:42:35 office ifplugd[224]: client: <3>Could not start any suitable profile
Nov 22 15:42:35 office ifplugd[224]: Program execution failed, return value is 1.

Nov 22 15:42:35 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 15:42:35 office ifplugd[224]: Link beat lost.
Nov 22 15:42:35 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 15:42:37 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Nov 22 15:42:37 office ifplugd[224]: Link beat detected.
At this point, ifconfig was not showing any IP address associated with eth0, although it was shown as "running"... So I invoked netctl stop/start eth0... Maybe I should have waited longer?

Nov 22 15:45:59 office systemd[1]: Stopped Networking for netctl profile eth0.
Nov 22 15:46:08 office systemd[1]: Created slice system-netctl.slice.
Nov 22 15:46:08 office systemd[1]: Starting Networking for netctl profile eth0...
Nov 22 15:46:08 office network[917]: Starting network profile 'eth0'...
Nov 22 15:46:08 office dhcpcd[922]: DUID 00:01:00:01:1c:72:2b:ea:06:92:c5:a0:0d:d6
Nov 22 15:46:08 office dhcpcd[922]: eth0: IAID eb:d3:a9:ca

Nov 22 15:46:09 office dhcpcd[922]: eth0: soliciting a DHCP lease
Nov 22 15:46:14 office dhcpcd[922]: eth0: offered 192.168.1.12 from 192.168.1.1
Nov 22 15:46:14 office dhcpcd[922]: eth0: probing address 192.168.1.12/24
Nov 22 15:46:18 office dhcpcd[922]: eth0: leased 192.168.1.12 for 86400 seconds
Nov 22 15:46:18 office dhcpcd[922]: eth0: adding route to 192.168.1.0/24
Nov 22 15:46:18 office dhcpcd[922]: eth0: adding default route via 192.168.1.1
Nov 22 15:46:19 office dhcpcd[922]: forked to background, child pid 969
Nov 22 15:46:19 office network[917]: Started network profile 'eth0'
Nov 22 15:46:19 office systemd[1]: Started Networking for netctl profile eth0.

And after that it was running ok. But I had intervened...

So time for a better documented test. I switched wlan0 to the normal router (192.1.68.1.xxx) first, otherwise I couldn't talk to it properly.

Nov 22 16:02:05 office root[4543]: +++++++ Cable unplugged
Nov 22 16:02:31 office dhcpcd[969]: eth0: carrier lost
Nov 22 16:02:31 office kernel: smsc95xx 1-1.1:1.0 eth0: link down
Nov 22 16:02:31 office dhcpcd[969]: eth0: deleting route to 192.168.1.0/24
Nov 22 16:02:31 office dhcpcd[969]: eth0: deleting default route via 192.168.1.1
Nov 22 16:02:32 office ifplugd[224]: Link beat lost.
Nov 22 16:02:33 office ntpd[284]: Deleting interface #6 eth0, 192.168.1.12#123, interface stats: received=9, sent=10, dropped=0, active_time=973 secs
Nov 22 16:02:38 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 down'.
Nov 22 16:02:38 office ifplugd[224]: Program execution failed, return value is 1.

Nov 22 16:02:59 office root[4612]: +++++++ Cable replugged
Nov 22 16:03:04 office dhcpcd[969]: eth0: carrier acquired
Nov 22 16:03:04 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Nov 22 16:03:04 office dhcpcd[969]: eth0: IAID eb:d3:a9:ca
Nov 22 16:03:04 office ifplugd[224]: Link beat detected.
Nov 22 16:03:04 office dhcpcd[969]: eth0: rebinding lease of 192.168.1.12
Nov 22 16:03:05 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 up'.
Nov 22 16:03:05 office ifplugd[224]: client: Starting network profile 'eth0'...
Nov 22 16:03:05 office dhcpcd[4633]: dhcpcd already running on pid 969 (/run/dhcpcd-eth0-4.pid)
Nov 22 16:03:05 office ifplugd[224]: client: <3>DHCP IPv4 lease attempt failed on interface 'eth0'
Nov 22 16:03:05 office dhcpcd[969]: eth0: carrier lost
Nov 22 16:03:05 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 16:03:05 office ifplugd[224]: client: <3>Failed to bring the network up for profile 'eth0'
Nov 22 16:03:05 office ifplugd[224]: client: <3>Could not start any suitable profile
Nov 22 16:03:05 office ifplugd[224]: Program execution failed, return value is 1.
Nov 22 16:03:05 office ifplugd[224]: Link beat lost.
Nov 22 16:03:05 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 16:03:05 office dhcpcd[969]: eth0: carrier acquired
Nov 22 16:03:05 office dhcpcd[969]: eth0: IAID eb:d3:a9:ca
Nov 22 16:03:05 office dhcpcd[969]: eth0: rebinding lease of 192.168.1.12
Nov 22 16:03:07 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Nov 22 16:03:07 office ifplugd[224]: Link beat detected.
Nov 22 16:03:10 office dhcpcd[969]: eth0: DHCP lease expired
Nov 22 16:03:10 office dhcpcd[969]: eth0: soliciting a DHCP lease
Nov 22 16:03:15 office dhcpcd[969]: eth0: offered 192.168.1.12 from 192.168.1.1
Nov 22 16:03:15 office dhcpcd[969]: eth0: probing address 192.168.1.12/24
Nov 22 16:03:19 office dhcpcd[969]: eth0: leased 192.168.1.12 for 86400 seconds
Nov 22 16:03:19 office dhcpcd[969]: eth0: adding route to 192.168.1.0/24
Nov 22 16:03:19 office dhcpcd[969]: eth0: adding default route via 192.168.1.1
Nov 22 16:03:19 office dhcpcd[969]: eth0: removing route to 192.168.1.0/24

Looks like the cable un/replug works like last time... Try the router

Nov 22 16:06:05 office root[4784]: +++++++ Router unplugged
Nov 22 16:06:11 office dhcpcd[969]: eth0: carrier lost
Nov 22 16:06:11 office kernel: smsc95xx 1-1.1:1.0 eth0: link down
Nov 22 16:06:11 office dhcpcd[969]: eth0: deleting default route via 192.168.1.1
Nov 22 16:06:11 office avahi-daemon[221]: Withdrawing address record for 192.168.1.12 on eth0.
Nov 22 16:06:11 office avahi-daemon[221]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.1.12.
Nov 22 16:06:11 office avahi-daemon[221]: Interface eth0.IPv4 no longer relevant for mDNS.
Nov 22 16:06:11 office ifplugd[224]: Link beat lost.
Nov 22 16:06:13 office ntpd[284]: Deleting interface #9 eth0, 192.168.1.12#123, interface stats: received=0, sent=0, dropped=0, active_time=172 secs
Nov 22 16:06:17 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 down'.
Nov 22 16:06:17 office ifplugd[224]: Program execution failed, return value is 1.
Nov 22 16:07:25 office root[4866]: +++++++ Router replugged
Nov 22 16:08:08 office dhcpcd[969]: eth0: carrier acquired
Nov 22 16:08:08 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 10Mbps, half-duplex, lpa 0x0020
Nov 22 16:08:09 office ifplugd[224]: Link beat detected.
Nov 22 16:08:09 office dhcpcd[969]: eth0: IAID eb:d3:a9:ca
Nov 22 16:08:09 office dhcpcd[969]: eth0: rebinding lease of 192.168.1.12
Nov 22 16:08:10 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 up'.
Nov 22 16:08:10 office ifplugd[224]: client: Starting network profile 'eth0'...
Nov 22 16:08:10 office dhcpcd[4911]: dhcpcd already running on pid 969 (/run/dhcpcd-eth0-4.pid)
Nov 22 16:08:10 office ifplugd[224]: client: <3>DHCP IPv4 lease attempt failed on interface 'eth0'
Nov 22 16:08:10 office dhcpcd[969]: eth0: carrier lost
Nov 22 16:08:10 office ifplugd[224]: client: <3>Failed to bring the network up for profile 'eth0'
Nov 22 16:08:10 office ifplugd[224]: client: <3>Could not start any suitable profile
Nov 22 16:08:10 office ifplugd[224]: Program execution failed, return value is 1.
Nov 22 16:08:10 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 16:08:10 office ifplugd[224]: Link beat lost.
Nov 22 16:08:10 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 16:08:10 office dhcpcd[969]: eth0: carrier acquired
Nov 22 16:08:10 office dhcpcd[969]: eth0: IAID eb:d3:a9:ca
Nov 22 16:08:10 office dhcpcd[969]: eth0: rebinding lease of 192.168.1.12
Nov 22 16:08:12 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 10Mbps, half-duplex, lpa 0x0020
Nov 22 16:08:13 office ifplugd[224]: Link beat detected.
Nov 22 16:08:15 office dhcpcd[969]: eth0: DHCP lease expired
Nov 22 16:08:15 office dhcpcd[969]: eth0: soliciting a DHCP lease
Nov 22 16:08:23 office dhcpcd[969]: eth0: carrier lost
Nov 22 16:08:23 office kernel: smsc95xx 1-1.1:1.0 eth0: link down
Nov 22 16:08:24 office ifplugd[224]: Link beat lost.
Nov 22 16:08:26 office dhcpcd[969]: eth0: carrier acquired
Nov 22 16:08:26 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lap 0x45E1
Nov 22 16:08:26 office dhcpcd[969]: eth0: IAID eb:d3:a9:ca
Nov 22 16:08:26 office dhcpcd[969]: eth0: soliciting a DHCP lease
Nov 22 16:08:27 office ifplugd[224]: Link beat detected.
Nov 22 16:08:54 office dhcpcd[969]: eth0: offered 192.168.1.12 from 192.168.1.1
Nov 22 16:08:54 office dhcpcd[969]: eth0: probing address 192.168.1.12/24
Nov 22 16:08:59 office dhcpcd[969]: eth0: leased 192.168.1.12 for 86400 seconds
Nov 22 16:08:59 office dhcpcd[969]: eth0: adding route to 192.168.1.0/24
Nov 22 16:08:59 office dhcpcd[969]: eth0: adding default route via 192.168.1.1
Nov 22 16:08:59 office avahi-daemon[221]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.1.12.
Nov 22 16:08:59 office avahi-daemon[221]: New relevant interface eth0.IPv4 for mDNS.
Nov 22 16:08:59 office avahi-daemon[221]: Registering new address record for 192.168.1.12 on eth0.IPv4.
Nov 22 16:08:59 office dhcpcd[969]: eth0: removing route to 192.168.1.0/24
Nov 22 16:09:01 office ntpd[284]: Listen normally on 10 eth0 192.168.1.12:123
Nov 22 16:09:01 office ntpd[284]: new interface(s) found: waking up resolver
Nov 22 16:13:41 office root[5157]: +++++++ Router unplugged
Nov 22 16:13:48 office dhcpcd[969]: eth0: carrier lost
Nov 22 16:13:48 office kernel: smsc95xx 1-1.1:1.0 eth0: link down
Nov 22 16:13:48 office dhcpcd[969]: eth0: deleting default route via 192.168.1.1
Nov 22 16:13:48 office avahi-daemon[221]: Withdrawing address record for 192.168.1.12 on eth0.
Nov 22 16:13:48 office avahi-daemon[221]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.1.12.
Nov 22 16:13:48 office avahi-daemon[221]: Interface eth0.IPv4 no longer relevant for mDNS.
Nov 22 16:13:48 office ifplugd[224]: Link beat lost.

Nov 22 16:13:54 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 down'.
Nov 22 16:13:54 office ifplugd[224]: Program execution failed, return value is 1.
Nov 22 16:14:30 office root[5224]: +++++++ Router replugged
Nov 22 16:15:16 office dhcpcd[969]: eth0: carrier acquired
Nov 22 16:15:16 office ifplugd[224]: Link beat detected.
Nov 22 16:15:16 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 10Mbps, half-duplex, lpa 0x0020
Nov 22 16:15:16 office dhcpcd[969]: eth0: IAID eb:d3:a9:ca
Nov 22 16:15:17 office dhcpcd[969]: eth0: rebinding lease of 192.168.1.12
Nov 22 16:15:17 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 up'.
Nov 22 16:15:17 office ifplugd[224]: client: Starting network profile 'eth0'...
Nov 22 16:15:17 office dhcpcd[5261]: dhcpcd already running on pid 969 (/run/dhcpcd-eth0-4.pid)
Nov 22 16:15:17 office ifplugd[224]: client: <3>DHCP IPv4 lease attempt failed on interface 'eth0'
Nov 22 16:15:17 office dhcpcd[969]: eth0: carrier lost
Nov 22 16:15:17 office ifplugd[224]: client: <3>Failed to bring the network up for profile 'eth0'
Nov 22 16:15:17 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 16:15:17 office ifplugd[224]: client: <3>Could not start any suitable profile
Nov 22 16:15:17 office ifplugd[224]: Program execution failed, return value is 1.
Nov 22 16:15:18 office ifplugd[224]: Link beat lost.
Nov 22 16:15:18 office kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Nov 22 16:15:18 office dhcpcd[969]: eth0: carrier acquired
Nov 22 16:15:18 office dhcpcd[969]: eth0: IAID eb:d3:a9:ca
Nov 22 16:15:18 office dhcpcd[969]: eth0: rebinding lease of 192.168.1.12
Nov 22 16:15:20 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 10Mbps, half-duplex, lap 0x0020
Nov 22 16:15:21 office ifplugd[224]: Link beat detected.
Nov 22 16:15:23 office dhcpcd[969]: eth0: DHCP lease expired
Nov 22 16:15:23 office dhcpcd[969]: eth0: soliciting a DHCP lease
Nov 22 16:15:31 office dhcpcd[969]: eth0: carrier lost
Nov 22 16:15:31 office kernel: smsc95xx 1-1.1:1.0 eth0: link down
Nov 22 16:15:32 office ifplugd[224]: Link beat lost.
Nov 22 16:15:34 office dhcpcd[969]: eth0: carrier acquired
Nov 22 16:15:34 office kernel: smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Nov 22 16:15:34 office dhcpcd[969]: eth0: IAID eb:d3:a9:ca
Nov 22 16:15:34 office dhcpcd[969]: eth0: soliciting a DHCP lease
Nov 22 16:15:35 office ifplugd[224]: Link beat detected.
Nov 22 16:15:46 office dhcpcd[969]: eth0: offered 192.168.1.12 from 192.168.1.1
Nov 22 16:15:46 office dhcpcd[969]: eth0: probing address 192.168.1.12/24
Nov 22 16:15:51 office dhcpcd[969]: eth0: leased 192.168.1.12 for 86400 seconds
Nov 22 16:15:51 office dhcpcd[969]: eth0: adding route to 192.168.1.0/24
Nov 22 16:15:51 office dhcpcd[969]: eth0: adding default route via 192.168.1.1
Nov 22 16:15:51 office avahi-daemon[221]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.1.12.
Nov 22 16:15:51 office avahi-daemon[221]: New relevant interface eth0.IPv4 for mDNS.
Nov 22 16:15:51 office avahi-daemon[221]: Registering new address record for 192.168.1.12 on eth0.IPv4.
Nov 22 16:15:51 office dhcpcd[969]: eth0: removing route to 192.168.1.0/24
Nov 22 16:15:53 office ntpd[284]: Listen normally on 11 eth0 192.168.1.12:123
Nov 22 16:15:53 office ntpd[284]: new interface(s) found: waking up resolver

Hmm. The main difference between the initial (failed) attempt and the (successful) subsequent attempts is that in the latter case the ifplugd daemon fails to stop the related dhcpcd instance, so it presumably keeps going and then eventually acquires an IP address. So why doesn't that work in the former case?? Aha...
Nov 22 15:42:35 office dhcpcd[707]: timed out
That's what happens - at that point, the dhcpcd stops, and is not restarted. I wonder if I can tweak the timeout??

++++ Some time elapses... ++++

OMG, impossible to find out how the dhcpcd gets started - it IS started with -t 30, which is the actual timeout value. Let's see...

systemd +
        + /usr/lib/systemd/system/ and /etc/systemd/system/ as config files
  + /usr/lib/systemd/system/netctl\@.service

[root@office etc]# systemctl status netctl@eth0.service
netctl@eth0.service - Networking for netctl profile eth0
   Loaded: loaded (/usr/lib/systemd/system/netctl@.service; static; vendor preset: disabled)
   Active: active (exited) since Tue 2016-11-22 15:46:19 GMT; 4h 2min ago
     Docs: man:netctl.profile(5)
 Main PID: 917 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/system-netctl.slice/netctl@eth0.service
           └─969 dhcpcd -4 -q -t 30 -L eth0

Nov 22 16:15:31 office dhcpcd[969]: eth0: carrier lost
Nov 22 16:15:34 office dhcpcd[969]: eth0: carrier acquired
Nov 22 16:15:34 office dhcpcd[969]: eth0: IAID eb:d3:a9:ca
Nov 22 16:15:34 office dhcpcd[969]: eth0: soliciting a DHCP lease
Nov 22 16:15:46 office dhcpcd[969]: eth0: offered 192.168.1.12 from 192.168.1.1
Nov 22 16:15:46 office dhcpcd[969]: eth0: probing address 192.168.1.12/24
Nov 22 16:15:51 office dhcpcd[969]: eth0: leased 192.168.1.12 for 86400 seconds
Nov 22 16:15:51 office dhcpcd[969]: eth0: adding route to 192.168.1.0/24
Nov 22 16:15:51 office dhcpcd[969]: eth0: adding default route via 192.168.1.1
Nov 22 16:15:51 office dhcpcd[969]: eth0: removing route to 192.168.1.0/24

Looks like it gets started here:

Feb 22 14:40:20 office ifplugd[224]: Executing '/etc/ifplugd/netctl.action eth0 up'.
Feb 22 14:40:20 office ifplugd[224]: client: Starting network profile 'eth0'...
Feb 22 14:40:20 office dhcpcd[383]: DUID 00:01:00:01:1c:72:2b:ea:06:92:c5:a0:0d:d6
Feb 22 14:40:20 office dhcpcd[383]: eth0: IAID eb:d3:a9:ca

The script /etc/ifplugd/netctl.action says:

$1 = etho
$2 = up

. /usr/lib/network/globals
{PROFILE_DIR="/etc/netctl"
{SUBR_DIR="/usr/lib/network"
{STATE_DIR="/run/network"
{STATE_FILE="${NETCTL_STATE_FILE:-/var/lib/netctl/netctl.state}"

Substituting values...

PROFILE_FILE="/run/network/ifplugd_eth0.profile"

case "up" in
  up)
    while read -r profile; do
        if ForceConnect=yes "/usr/lib/network/network" start "eth0"; then
            mkdir -p "$(dirname "/run/network/ifplugd_eth0.profile")"
            printf "%s" "$profile" > "/run/network/ifplugd_eth0.profile"
            exit 0
        fi
    done < <(list_profiles | while read -r profile; do
        report_debug "Examining profile 'eth0'"
        (
          source "/etc/netctl/eth0" > /dev/null
which has 
    Description='eth0 connection'
    Interface=eth0
    ForceConnect=yes
    SkipNoCarrier=yes
    Connection=ethernet
    IP=dhcp

          [[ $Interface == "eth0" && $Connection == "ethernet" ]] || exit
          # Prioritize dhcp based profiles as they can outright fail, whereas
          # it is difficult to tell if a profile with a static address fails
          if [[ $IP == "dhcp" || $IP6 == dhcp* ]]; then
              : ${ExcludeAuto:=no}
              : ${Priority:=1}
          fi
          is_yes "${ExcludeAuto:-yes}" && exit
          printf "%i\t%s\n" "${Priority:-0}" "$profile"
          report_debug "Included profile '$profile'"
        )
    done | sort -nrs -k 1,1 | cut -f 2-)
    report_error "Could not start any suitable profile"
  ;;
  down)
    if [[ -e "/run/network/ifplugd_eth0.profile" ]]; then
        if ForceConnect=yes "/usr/lib/network/network" stop "$(< "/run/network/ifplugd_eth0.profile")"; then
            rm -f "/run/network/ifplugd_eth0.profile"
            exit 0
        fi
    fi
  ;;
  *)
    echo "Wrong arguments" >&2
  ;;
esac


exit 1

So - no obvious clues there... let's look at the referenced SUBR file /usr/lib/network/network:

$1 = start
$2 = eth0


# Expose the profile name
Profile=eth0
load_profile "$Profile"

which translates to

## Sources all hooks, a profile and any interface hook
# $1: profile name
load_profile() {
    local hook
    if [[ ! -r "/etc/netctl/etho" ]]; then
        exit_error "Profile 'eth0' does not exist or is not readable"
    fi
    while read -r hook; do  Nothing in the hooks directory!!
        source "$hook"
    done < <(find -L "/etc/netctl/hooks" -maxdepth 1 -type f -executable -not -name '.*' -not -name '*~' | sort -u)
    source "/etc/netctl/$1"
    if [[ -z eth0 ]]; then
        exit_error "Profile 'eth0' does not specify an interface"
    fi
    if [[ ! -r "${ethernet:+/usr/lib/network/connections/ethernet}" ]]; then
        exit_error "Profile 'eth0' does not specify a valid connection"
    fi
    if [[ -x "$PROFILE_DIR/interfaces/eth0" ]]; then
        source "$PROFILE_DIR/interfaces/eth0"
    fi
    source "/usr/lib/network/connections/ethernet"

which contains


. "/usr/lib/network/ip" - contains some DHCP-invoking code!!

which we want to end up with a result like
dhcpcd -4 -q -t 30 -L eth0

dhcp_call "${DHCPClient:-dhcpcd}" start 4

## Interface a DHCP client
# $1: DHCP client - dhcpcd
# $2: command     - start
# $3...: additional arguments - 4
dhcp_call() {
    local client="dhcpcd" command="start"
    shift 2

    if [[ ! -r "/usr/lib/network/dhcp/dhcpcd" ]]; then
        report_error "DHCP client 'dhcpcd' is unsupported"
        return 127
    fi
    if ! source "/usr/lib/network/dhcp/dhcpcd"; then
        report_error "DHCP client 'dhcpcd' is not installed or not ready"
        return 127
    fi
    "dhcpcd_start" "4"

}

The file /usr/lib/network/dhcp/dhcpcd contains:

dhcpcd_start() {
    local options
    case $1 in
      4) options="$DhcpcdOptions -L";; - guess we're going for IP4!
      6) options=$DhcpcdOptions6;;
      *) return 1;;
    esac
    if [[ $2 == "noaddr" ]]; then
        report_error "Using 'dhcpcd' for configuration without address assignment is unsupported"
        return 1
    fi
    # If using own dns, tell dhcpcd to NOT replace resolv.conf
    [[ $DNS ]] && options+=" -C resolv.conf"
    if ! do_debug do_readable dhcpcd -$1 -q -t "${TimeoutDHCP:-30}" $options "$Interface"; then
        report_error "DHCP IPv$1 lease attempt failed on interface '$Interface'"
        return 1
    fi
}

Added a line to /etc/netctl/eth0:
TimeoutDHCP=60
Let's see if that does the trick! It's also suggested in https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=46348

ethernet_up() {
    if ! is_interface "$Interface"; then
        report_error "Interface '$Interface' does not exist"
        return 1
    fi

    # Disable IPv6 before bringing the interface up to prevent SLAAC
    if [[ $IP6 == "no" ]]; then
        sysctl -q -w "net.ipv6.conf.${Interface/.//}.disable_ipv6=1"
    fi

    if ! bring_interface_up "$Interface"; then
        report_error "Failed to bring interface '$Interface' up"
        return 1
    fi

    if is_yes "${SkipNoCarrier:-no}"; then
        SkipDAD=yes

    else
        # Some cards are plain slow to come up. Don't fail immediately.
        if ! timeout_wait "${TimeoutCarrier:-5}" '(( $(< "/sys/class/net/$Interface/carrier") ))'; then
            report_error "No connection found on interface '$Interface' (timeout)"
            bring_interface_down "$Interface"
            return 1
        fi
    fi


    if is_yes "${Auth8021X:-no}"; then
        . "$SUBR_DIR/wpa"
        : ${WPAConfigFile:=/etc/wpa_supplicant.conf}
        : ${WPADriver:=wired}
        : ${TimeoutWPA:=15}

        if ! wpa_start "$Interface" "$WPADriver" "$WPAConfigFile"; then
            report_error "The WPA supplicant did not start for interface '$Interface'"
            bring_interface_down "$Interface"
            return 1

        fi

        if ! wpa_wait_until_state "$TimeoutWPA" "$Interface" "ASSOCIATED"; then
            wpa_stop "$Interface"
            bring_interface_down "$Interface"
            report_error "WPA Authentication/Association Failed"
            return 1
        fi
    fi

    if ! ip_set; then
        stop_8021x
        bring_interface_down "$Interface"
        return 1
    fi
}



}


case $1 in
  start)
    report_notice "Starting network profile '$Profile'..."
    if is_interface "$Interface" && interface_is_up "$Interface" && \
       ! is_yes "${ForceConnect:-no}"; then
        report_error "The interface of network profile '$Profile' is already up"
        exit 1
    fi
    if ! "${Connection}_up"; then
        report_error "Failed to bring the network up for profile '$Profile'"
        exit 1
    fi
    # JP: sandbox the eval
    if ! ( eval $ExecUpPost ); then
        report_error "ExecUpPost failed for network profile '$Profile'"
        # Failing ExecUpPost will take the connection down
        "${Connection}_down"
        exit 1
    fi

    report_notice "Started network profile '$Profile'"
;;
  stop)
    report_notice "Stopping network profile '$Profile'..."
    # JP: sandbox the eval
    if ! ( eval $ExecDownPre ); then
        report_error "ExecDownPre failed for network profile '$Profile'"
        # Failing ExecDownPre will leave the profile active
        exit 1
    fi
    if ! "${Connection}_down"; then
        report_error "Failed to bring the network down for profile '$Profile'"
        exit 1
    fi
    if is_interface "$Interface" && interface_is_up "$Interface" && \
       ! is_yes "${ForceConnect:-no}"; then
        report_error "The interface of network profile '$Profile' did not go down"
        exit 1
    fi
    report_notice "Stopped network profile '$Profile'"
  ;;
esac


No comments:

Post a Comment