Friday, 5 August 2016

iPhone Unifi DHCP issue

This is technical.

For a long time now my son James, and I, have been cursing Apple. We have iPhones, and iPads and all sorts, but we keep finding the WiFi not working in the house.

To explain the symptoms: in the morning I normally get up, have a bath, and use my phone in the bath to check Facebook, Twitter and so on, and every damn day I find my phone stops working in some way on the WiFi. Basically Facebook shows blank panes that do not load, or things don't show new stuff. I have to go to WiFi settings, where I see a 169 IP address (the standard self-assigned address when there is no response to DHCP) as per the image on the right. The fix is WiFi off/on or airplane mode on/off. Sometimes I have to do this two or three times. It pisses me off.
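
For anyone checking their own devices: the "169 address" is a self-assigned IPv4 link-local address from 169.254.0.0/16, which a client picks for itself when it has no DHCP lease. A minimal sketch of spotting one, using Python's standard ipaddress module (the function name is made up for illustration):

```python
import ipaddress

def is_self_assigned(addr: str) -> bool:
    """Return True if addr is an IPv4 link-local (169.254.0.0/16) address."""
    ip = ipaddress.ip_address(addr)
    # IPv6 also has link-local addresses (fe80::/10); only IPv4 counts here.
    return ip.version == 4 and ip.is_link_local

print(is_self_assigned("169.254.17.3"))   # address the phone gave itself
print(is_self_assigned("192.168.0.120"))  # address handed out by a DHCP server
```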

I have done a lot of work on this. The set up is as follows :-
  • iPhones and iPads, latest code, James has beta code
  • Unifi APs, latest code, three of them, all same network and SSID
  • FireBrick doing DHCP
All of these are pretty solid systems, and should not screw up like this. I did loads on the FireBrick DHCP (seeing as I wrote it) trying everything I could think of - tweaking the TTL on responses, changing lease times, all sorts. Nothing helped.

I have pinned down that the problem happens on change of AP: when I get in the bath I am between APs and the phone moves from one to the other, and that is when it loses DHCP. To add to the fun, it has IPv6 working OK but not IPv4, so extra special.

Firstly, this shows a clear bug in the Apple code - there is "WiFi Assist" to handle poor wifi (and use mobile), but that does not kick in when you have working IPv6 and no reply on IPv4 DHCP. It knows we have no IPv4 on WiFi (hence 169 address). Maybe it should, at least, use mobile for IPv4 traffic!

But my packet dumps suggest no attempt to get DHCP in these cases. We see IPv6 working, but IPv4 packets to the phone and ARPs to the phone do not work, and there are no DHCP requests.
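
To make "no DHCP requests" concrete: a DHCP client sends from UDP port 68 to UDP port 67, so a dump can be checked for frames of that shape. The following is only a sketch of such a check on a raw, untagged Ethernet frame. The function name and the hand-built sample frame are illustrative assumptions, not part of any tool used above, and capturing the frames is left out.

```python
import struct

ETH_HEADER_LEN = 14

def is_dhcp_client_packet(frame: bytes) -> bool:
    """True if an untagged Ethernet frame carries IPv4 UDP from port 68 to port 67."""
    if len(frame) < ETH_HEADER_LEN + 20 + 8:
        return False
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:                      # not IPv4 (e.g. ARP, IPv6, VLAN tag)
        return False
    ihl = (frame[ETH_HEADER_LEN] & 0x0F) * 4     # IPv4 header length in bytes
    if ihl < 20 or frame[ETH_HEADER_LEN + 9] != 17:   # protocol 17 = UDP
        return False
    udp = ETH_HEADER_LEN + ihl
    if len(frame) < udp + 8:
        return False
    src_port, dst_port = struct.unpack("!HH", frame[udp:udp + 4])
    return src_port == 68 and dst_port == 67

# Hand-built sample: broadcast Ethernet + minimal IPv4 header + UDP 68 -> 67.
eth = b"\xff" * 6 + b"\x02" * 6 + b"\x08\x00"
ip = bytes([0x45, 0, 0, 28, 0, 0, 0, 0, 64, 17, 0, 0]) + bytes(4) + b"\xff" * 4
udp_hdr = struct.pack("!HHHH", 68, 67, 8, 0)
print(is_dhcp_client_packet(eth + ip + udp_hdr))
```

If a check like this never fires for the phone's MAC address while the phone sits on a 169 address, the client is not asking, which matches what the dumps showed.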

James spotted the broken "WiFi Assist" and we tried turning that off, but sadly no joy. It is not 100% reproducible, so hard to be sure we have found a fix, sadly. But this did not work.

So what next? Is it an Apple bug, or a Unifi bug, or even a FireBrick bug? So we tried some more.

In some degree of desperation we set all three APs to different SSIDs. This is to see if that works, but the problem is the phone sticks to the wrong SSID even with really low signal as we move around the house. Yes, when/if it changes AP it gets a new IP, but it sits on the crappy signal for ages. It does not switch until it totally loses signal. Bugger.

So what next. Well, I have made the cardinal sin of changing two things at once, and this seems to be working. If I can, I'll update and confirm the diagnosis with one change later.
  • I set up the SSID we are using only on 5GHz not 2GHz
  • I changed the Unifi to consider all three APs to be separate Wireless Networks that happen to be on same LAN (VLAN not set) and happen to be same SSID, rather than one network on all three APs
I have tried many times to break it now, and when moving from one AP to another the phone quickly re-associates with the nearer AP and does not lose its IP addressing at all. It seems to be working! The real test will be the bath, tomorrow.

So, jury still out on whether this is an iPhone issue or a Unifi issue. But we have working WiFi at last.

When we come to reporting this to one or other of them, we will not get far, I am sure.

Update: In spite of all my testing, it just went wrong again, arrrrg! What is worse is that it just sits there with a 169 address not even trying.

Update: I tried changing all the APs to actually be on the same channel, and setting a min rssi. That seemed to help in that nothing broke this morning. We'll see how it goes for a few days. Several of the comments suggest this is definitely an Apple issue but at this stage I am trying to work around it.

Update: The 5GHz has helped, but sadly, today, it failed again, so trying the same SSID and channel did not help. Next step is to try different SSIDs and set min RSSI.

Update: Using different SSIDs does fix this, and setting min RSSI means they do switch. Not ideal. I tried working out Enterprise WPA but have not worked out the RADIUS responses I need yet (if anyone has a pcap that would be helpful). One other simple thing to try would be fixed IP rather than DHCP.
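
The idea behind min RSSI, roughly, is that an AP drops any client whose signal falls below a threshold, so the client has to go and find a better AP instead of clinging to a weak one. A rough sketch of that decision follows. It is not Unifi's actual implementation, and the names, readings and the -75 dBm threshold are made up for illustration.

```python
from typing import Dict, List

def clients_to_kick(rssi_dbm: Dict[str, int], min_rssi_dbm: int) -> List[str]:
    """Return the clients whose signal is weaker than the threshold, sorted by name."""
    return sorted(name for name, rssi in rssi_dbm.items() if rssi < min_rssi_dbm)

# Hypothetical signal readings as seen by one AP (dBm; closer to 0 is stronger).
seen = {"phone": -82, "ipad": -61, "laptop": -75}
print(clients_to_kick(seen, -75))  # → ['phone']
```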

32 comments:

  1. I've had the exact same thing with a Macbook Pro and iPhone using BT Hub/Plusnet Hub but "fixed" it by changing the SSID of the 5GHz too. The Apple stuff just doesn't like switching between the same SSIDs using the different bands.

  2. I'm not sure about the unifi team but the edgeOS team at UBNT are pretty good at taking bug reports if you are able to show them how to reproduce it.

  3. Do you have Zero Handoff enabled? Might it be something to do with that?

    Replies
    1. It won't let me enable that - as it was one of the things I wanted to try. It is their latest models.

    2. I think, when I tried it, I had to create a grouping other than "Default" to enable Zero Handoff mode on the selected APs. Or something like that.

    3. Tried that, lets me make the group but then cannot assign to these APs

    4. You don't really want to use zero handoff. Honestly, you don't. Unless you need to use it for voip whilst wandering around your house, use the min RSSI feature to tune your AP connectivity.

    5. They dropped ZHO from the newest AP's (you will see it mentioned but it doesn't work) P.S you don't want to use ZHO anyway as per Fuzzycat's note.

  4. I have the same issue but it does not just occur on wifi clients. Rebooting my firebrick fixes it. In my situation it seems to occur when the house is unoccupied for a period of time, I can only get a dhcp address after rebooting the brick. I did not have the problem when I was running pf sense as the dhcp server. I have been meaning to ask support why the brick seems to stop doing dhcp after a period of time.

    Replies
    1. That is likely to be a different issue. In this case no DHCP packets reach the brick so not a brick issue. Happy to look into your issue. Make sure you are on the latest s/w anyway.

  5. We see exactly the same problem with iOS devices on Ruckus wifi kit (at multiple customers), but only when using WPA Enterprise auth. I've seen the same thing reported on Cisco APs too and I'm pretty sure I saw a thing from Ruckus basically saying it's a known iOS bug (unreliable roaming). Been going on for ages, affects multiple versions of iOS, clearly Apple doesn't care.

    Unfortunately not much we can do - our company policy these days is to not report bugs to Apple. The reason being that we have reported numerous bugs over the years and the response is always the same: they tell us to do some standardised debugging that takes hours to do and produces no information over and above what we already provided them with. So we do that and not once have they actually bothered to fix a bug we've reported. So the upshot is we concluded it was a complete waste of our time and have given up filing bug reports with Apple.

  6. Just to comment on the AP's holding onto the signal till it dies, you can force ap switching by setting "min rssi" on the ap.

    http://imgur.com/jsglhaM

    Replies
    1. Maybe we need to try that then.

  7. The answer is to use wpa enterprise. You know the one that requires a radius server. You'll find the hand off between AP's just works so much better.

    Replies
    1. Pooh another thing to try and I can do RADIUS.

  8. Have you tried changing the Min RSSI (under 'Radios') on the AP's so that you get kicked off the weaker signal ones quicker?

  9. This morning it has not broken. After it broke last night I put all the APs on the *same* 5GHz channel!

  10. Our home network (IPv4, one SSID, two channels, 2.4GHz) used to have an issue with handoff when two of the APs were cheap crappy TP-Link.
    Since they're all now MikroTik, no problems.

  11. I have had something similar with certain Android devices and APs named the same on both 2.4 ghz and 5ghz. It seems the stack gets confused by seeing the same AP name, but roaming between frequencies on the same AP. I also found in my case that the DD-WRT firmware performed better than the original TP link software.

  12. How do we put political pressure on Apple to make them start to care? We need to somehow increase the embarrassment and pain.

  13. +1 for Mikrotik. Have roaming working well with capsman & forced roaming using access list and min rssi plus lowered TX power on AP's

  14. I have a pair of Airport Extremes (4th and 5th generation). One creates the network (different SSIDs for 2.4 and 5GHz), the other extends the 5GHz network over wifi for upstairs onto the cabled section of my network. Nearly all my traffic to my broadband goes over this wifi extended link, it gives very little trouble. My iPad Air2 is on the 5GHz SSID and gives no trouble at all, even though it moves between the two Extremes when I go between upstairs and downstairs.

    The only thing this setup does that I don't like is the upstairs Extreme doing the extending creates an extra 2.4GHz network using the 5GHz SSID and on a channel I can't control. When an Extreme is set to "Extend a wireless network" you get no control. I'd like to switch the bonus 2.4GHz network off.

    So to return to the point, it seems the iPad Air2 has none of the problems people are describing when it is used with Apple's own APs, at least in "Extend a wireless network" mode when they're both on the same channel number. And this Air2 has mobile data with a Three SIM in it, so it's nothing to do with mobile broadband not being on the device.

  15. I have a pair of UAP-AC-Pro units and have never had this issue. Mine are set up like so:

    2.4GHz and 5GHz on different channels, e.g. 1 and 6 for 2.4GHz and 36 and 52 for 5GHz.

    2.4GHz set to HT20 only, 9 dBm transmit power.
    5GHz set to HT80, 20 dBm transmit power.
    Avoid setting Min RSSI.

    Disable both Band Steering and Airtime Fairness.

    All my networks are WPA-PSK, set up as WPA2 Only and AES/CCMP Only.

    I am also currently running a Unifi beta, which is behaving rather well. I'd be sorely tempted to give it a go if I were you. You definitely want Unifi 5.x though.

  16. https://support.apple.com/en-gb/HT203068 is relevant to your interests - it documents how iOS chooses to change AP, if you're using WPA Enterprise.

    Replies
    1. Thanks for this, before I read that I wasn't even aware there was a 802.11k, r or v.

  17. Is there a way I could be notified if a fix or workaround is found? I'm considering buying a pack of Ubiquiti kit from AA, but I don't dare go ahead as I'm reliant on iPads and iPhones and the whole point of the Ubiquiti multi-AP thing is to just work, seamlessly. Chris Boot describes a successful config. WPA enterprise was mentioned too, but I don't have the server tech needed to support it. Could AA publicise the state of play / progress on this issue?

    Replies
    1. If I find more I'll post on here. So far I think I can say that different SSIDs do just work. I am also thinking of trying fixed IP which is simple enough on an iPhone. Enterprise will take a tad more to set up.

  18. I noticed this. it's very annoying. better positioning of the APs to optimise signal overlap and reducing RSSI to force switching helped but not enough.

  19. Hi. Same issue here. Using TPLink 1043ND V3 with ddwrt firmware as a main router. I have another router, same model and FW, connected to the main router LAN to LAN by cable for more WiFi coverage. All clients work well on WiFi; only the iPhone is losing WiFi signal. In the iPhone settings I see that it receives from the router a wrong IP, 169.254.x.x, with subnet mask 255.255.0.0. These are wrong numbers. My DHCP is set to 192.168.0.100-150 with a 255.255.255.0 subnet. Also I use WPA2 Personal. I tried many of the things mentioned above but no solution yet. However, I have not tried WPA Enterprise yet. Does that solution work? If so, is there any suggestion on how to set it up?

  20. Update: I tried to disable wireless security. My iPhone still loses its WiFi connection after a short period of time. Still no solution yet.

  21. Hey guys,

    UBNT-Brandon here. Have you tried our most recent stable release?

    https://www.ubnt.com/download/unifi

    Thanks,
    Brandon
