My last post went over an ESP8266-based wifi gateway for Milight/LimitlessLED bulbs. This supports a few kinds of bulbs that have been around for a couple of years (e.g., this one).
About a year ago, newer bulbs and controllers started showing up that used a different 2.4 GHz protocol. This introduced some scrambling that made it difficult to emulate many devices. This was presumably done intentionally to prevent exactly the sort of thing that my last project accomplished (boo!).
The new bulbs actually have some really cool features that none of the old ones do, so there’s some incentive to figure this out. In particular, they support saturation, which allows for ~65k (2**16) colors with variable brightness instead of the 256 colors that the old ones support. They also combine RGB and CCT (adjustable white temperature) in one bulb, which is super cool.
A few others have dug into this a little, but as far as I’ve been able to tell, no one has figured out (or at least shared) how to de-scramble the protocol. I think I’ve managed to do so. I should mention that I don’t have much experience doing this kind of thing, so it’s entirely possible the structure I’m imposing is a lot more complicated than what’s actually going on. But as far as I’ve been able to tell, it does work. I’ve tested with five devices – four remotes and one wifi box.
I’m going to start by detailing the structure, and I’ll follow up with some of the methodology I used to reverse the protocol.
Differences from old protocol
From a quick glance, there are a few superficial differences between the new and old protocols:
- Listens on a different channelset (this was true of different bulb types among the old bulbs too). The new bulbs use channels 8, 39, and 70.
- Different PL1167 syncword. It uses 0x7236, 0x1809.
- Packets have 9 bytes instead of 7.
- Packets are scrambled. The same command can look completely different.
The scrambling is the tricky part. As others who have stared at packet captures noticed, when the first byte of packets for the same command is held fixed, most of the other bytes stay fixed too. This suggests that the first byte is some kind of scramble key. Turns out this is the case.
Example packets for turning group 1 on with one of my remotes:
    A7 45 E9 6F BB 99 9E CF 4C
    E1 D6 4F 79 99 78 16 9A A7
    23 D1 55 03 47 25 2E 5B E8
    92 18 18 33 42 15 60 02 CE
    CA E0 E0 EB 0A DD A1 CA 9F
Notation
For a packet p, I’ll use pi to refer to the 0-indexed ith byte in the packet. For example, p0 refers to the 0th byte.
I’ll use p’ to refer to the scrambled packet for a packet p.
Structure
The 9 bytes of the packet are:
| p0 | p1 | p2 | p3 | p4 | p5 | p6 | p7 | p8 |
|---|---|---|---|---|---|---|---|---|
| Key | 0x20 | ID1 | ID2 | Command | Argument | Sequence | Group | Checksum |
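To make the layout concrete, here is a small sketch that names the nine positions of an unscrambled packet. The field names come from the table; the `Packet` struct, the `:prefix` name for the constant 0x20 byte, and the `parse_packet` helper are made up for illustration and are not part of the library linked later.

```ruby
# Field order follows the structure table: p0 through p8.
FIELDS = %i[key prefix id1 id2 command argument sequence group checksum].freeze
Packet = Struct.new(*FIELDS)

# Parse a space-separated hex string (an UNSCRAMBLED packet) into named fields.
def parse_packet(hex)
  bytes = hex.split.map { |b| Integer(b, 16) }
  raise ArgumentError, "expected 9 bytes, got #{bytes.length}" unless bytes.length == 9

  Packet.new(*bytes)
end

# Hypothetical unscrambled packet, only meant to show the field positions.
pkt = parse_packet("A7 20 11 8D 01 41 95 01 A1")
puts format("command=0x%02X group=0x%02X", pkt.command, pkt.group)
```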
Packet scrambling
The designer of this protocol added in quite a few things to complicate reversing it. None of them are particularly hard on their own, but together they make it pretty tough.
The scrambling algorithm is basically:
- A scramble key k is computed from p0
- Each byte position i has a different set of four 1-byte integers A[i]. Integer A[i][j] is used when p0 ≡ j mod 4.
- A[i][j] is up-shifted by 0x80 when p0 is in the range [0x54, 0xD3]. This does not apply to the checksum byte.
- p’i = ((pi ⊕ k) + A[i][p0 mod 4]) mod 256, where ⊕ is a bitwise exclusive or.
The algorithm to compute k is as follows (in ruby):
    def xor_key(p0)
      # Generate most significant nibble
      shift = (p0 & 0x0F) < 0x04 ? 0 : 1
      x = (((p0 & 0xF0) >> 4) + shift + 6) % 8
      msn = (((4 + x) ^ 1) & 0x0F) << 4

      # Generate least significant nibble
      lsn = ((((p0 & 0xF) + 4) ^ 2) & 0x0F)

      msn | lsn
    end
A values:
| Position | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| p1 | 0x45 | 0x1F | 0x14 | 0x5C |
| p2 | 0x2B | 0xC9 | 0xE3 | 0x11 |
| p3 | 0x6D | 0x5F | 0x8A | 0x2B |
| p4 | 0xAF | 0x03 | 0x1D | 0xF3 |
| p5 | 0x5A | 0x22 | 0x30 | 0x11 |
| p6 | 0x04 | 0xD8 | 0x71 | 0x42 |
| p7 | 0xAF | 0x04 | 0xDD | 0x07 |
| p8 | 0x61 | 0x13 | 0x38 | 0x64 |
There are probably actually several possible values for some of these. It really only matters that they line up in a particular way because of the checksum.
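Putting the pieces together, the scrambling steps can be sketched roughly like this. It is a minimal illustration rather than the linked library: `A_VALUES`, `offset`, `scramble`, and `unscramble` are names chosen here, and it assumes p0 is transmitted as-is (the example packets decode sensibly under that assumption).

```ruby
# Additive offsets from the table, indexed as A_VALUES[i][p0 % 4].
# Index 0 is unused because p0 is assumed to be sent unmodified.
A_VALUES = [
  nil,
  [0x45, 0x1F, 0x14, 0x5C],
  [0x2B, 0xC9, 0xE3, 0x11],
  [0x6D, 0x5F, 0x8A, 0x2B],
  [0xAF, 0x03, 0x1D, 0xF3],
  [0x5A, 0x22, 0x30, 0x11],
  [0x04, 0xD8, 0x71, 0x42],
  [0xAF, 0x04, 0xDD, 0x07],
  [0x61, 0x13, 0x38, 0x64]
].freeze

# Same key derivation as in the post.
def xor_key(p0)
  shift = (p0 & 0x0F) < 0x04 ? 0 : 1
  x = (((p0 & 0xF0) >> 4) + shift + 6) % 8
  msn = (((4 + x) ^ 1) & 0x0F) << 4
  lsn = (((p0 & 0x0F) + 4) ^ 2) & 0x0F
  msn | lsn
end

# Offset for byte position i, shifted by 0x80 when p0 is in [0x54, 0xD3].
# The shift is skipped for the checksum byte (position 8).
def offset(i, p0)
  a = A_VALUES[i][p0 % 4]
  a += 0x80 if i != 8 && p0.between?(0x54, 0xD3)
  a & 0xFF
end

# p'i = ((pi ^ k) + A[i][p0 mod 4]) mod 256
def scramble(packet)
  p0 = packet[0]
  k = xor_key(p0)
  [p0] + (1..8).map { |i| ((packet[i] ^ k) + offset(i, p0)) & 0xFF }
end

# Inverse: pi = ((p'i - A[i][p0 mod 4]) mod 256) ^ k
def unscramble(scrambled)
  p0 = scrambled[0]
  k = xor_key(p0)
  [p0] + (1..8).map { |i| ((scrambled[i] - offset(i, p0)) & 0xFF) ^ k }
end

decoded = unscramble([0xA7, 0x45, 0xE9, 0x6F, 0xBB, 0x99, 0x9E, 0xCF, 0x4C])
puts decoded.map { |b| format("%02X", b) }.join(" ")
```

Run against the first two example packets, this yields p1 = 0x20, p4 = 0x01, and p7 = 0x01 (group 1) for both, with matching ID bytes.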
In addition to all of this, command arguments have different offsets from 0, and some commands (i.e., saturation and brightness) have the same p4 value with arguments spanning different ranges. For example, arguments for brightness start at 0x4F. Arguments for color start at 0x15.
Checksum
The checksum byte is the sum of every byte in the unscrambled packet except the first (scramble key) and the last (the checksum itself), plus k + 2.
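As a rough sketch of that calculation: the `checksum` helper below is a name chosen for illustration, it takes the xor key k as an argument, and it assumes the sum simply wraps modulo 256 so that it fits in one byte.

```ruby
# Sum p1..p7 of the unscrambled packet, add k + 2, keep the low byte.
# The modulo-256 wrap is an assumption, not something the post states.
def checksum(packet, k)
  (packet[1..7].sum + k + 2) & 0xFF
end

# Made-up unscrambled packet and key, just to show the call.
packet = [0x00, 0x20, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x00]
puts format("0x%02X", checksum(packet, 0x10))
```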
Code
Further detail is probably easier to communicate in code, so here is a ruby library that can encode/decode packets. The project on github also has a ton of packets I scraped from my devices.
Methodology
I scraped a ton of packets with this script.
Figuring this out was mostly making assumptions, pattern recognition, and trial and error. The most helpful assumption was that sequential values for arguments in the UDP protocol had sequential values in the 2.4 GHz protocol (this turned out to be true).
I noticed that packets that had p0 values with the same remainder mod 4 followed a nearly sequential pattern. It often looked something like 0xC, 0xD, 0xA, 0xB, etc. A sequence follows this pattern when it’s xored with a constant. I wrote some cruddy ruby methods to brute-force search for constants that yielded the sequence [0, 1, …, N]. This also allowed me to find the values for the As.
Roughly the same process was repeated for each byte. Bytes that are constants were trickier because they didn’t follow a sequence. I instead brute-forced values for A given a sequence of xor keys.
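The brute-force search can be sketched along these lines. This is a reconstruction of the idea, not the original methods: `find_constants` is a hypothetical name, and it assumes the observed bytes come from packets sharing one p0 value and carrying sequential arguments.

```ruby
# Try every (k, a) pair and keep those for which ((byte - a) mod 256) ^ k
# turns the observed bytes into start, start + 1, ..., start + N.
def find_constants(observed, start: 0)
  target = (start...(start + observed.length)).to_a
  all_bytes = (0..255).to_a
  all_bytes.product(all_bytes).select do |k, a|
    observed.map { |b| ((b - a) & 0xFF) ^ k } == target
  end
end

# Synthetic data: the values 0..7 scrambled with k = 0x49 and a = 0xDC.
observed = (0..7).map { |v| ((v ^ 0x49) + 0xDC) & 0xFF }
find_constants(observed).each { |k, a| puts format("k=0x%02X a=0x%02X", k, a) }
```

A search like this always returns more than one pair: flipping the top bit of k while adding 0x80 to a gives the same decoded bytes, which is one reason several values are possible for some of the offsets.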
Next Steps
I’ll be porting over the scrambling code to my ESP8266 milight hub to add support for the new bulbs.
UPDATE 2017-03-28: A few kind volunteers sent me packet captures from their devices, and the ID bytes were not staying fixed under decoding. Assuming my methodology is right, these should be the right values for all parameters, with the possible exception of p1 and the checksum byte.
UPDATE 2017-03-20: I found that the wifi box I was testing with supported older protocols, which transmit the unscrambled device ID. There were several possible values for the ID byte offsets, and I chose a few of them arbitrarily. The decoded ID in the scrambled protocol was not matching the ID in the unscrambled protocol. Updating the additive offset values fixed this.