This seems to be one of those 'too good to be true' situations. How accurate do you think their conclusion is?
Me, I'd put good money on "not accurate at all". At least, not as folks generally understand "self-aware". There have been beat-ups about "self-aware" strong AI ever since the Eliza program in the 1960s. As a teen I read Joseph Weizenbaum's "Computer Power and Human Reason" - the wikipedia article doesn't do it justice. A lot of his argument was that weak AI, i.e. building intelligence along classical programming lines with codified rules and algorithms, is reasonably straightforward; but an algorithm that simulates following a set of logic rules is quite different from strong AI, where you are trying to build something more like human consciousness. From some digging, it seems the core formalism here is the Deontic Cognitive Event Calculus, which looks very cool as a way of codifying and modelling logic, self-awareness and the like. But there's a huge gap, in my mind at least, between being able to model and simulate self-awareness, and actually being self-aware as most people would understand it.
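To make the "classical programming lines" point concrete, here's a minimal Eliza-style sketch (my own toy, nothing to do with the paper's system): a handful of hand-coded pattern/response rules that produce conversation-shaped output with no understanding behind them.

```python
import re

# Hand-coded pattern -> response rules, Eliza-style. All the apparent
# "intelligence" lives in rules a programmer wrote; nothing is learned.
RULES = [
    (re.compile(r"\bI am (.+)", re.I), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.I), "Tell me more about feeling {0}."),
    (re.compile(r"\bbecause (.+)", re.I), "Is that the real reason?"),
]

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Please go on."  # default when no rule matches

print(respond("I am worried about self-aware robots"))
# -> Why do you say you are worried about self-aware robots?
```

A few lines of pattern matching can look eerily like a conversation, which is exactly Weizenbaum's warning.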
Just wanted to point out that modern AI (at least the machine learning subfield) is not based on hard-coded logic rules. Instead, it's based on systems that observe data and automatically learn functions that model that data and generalize to unseen data. We've come a long way since Eliza, but we're still far from strong AI.
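As a toy illustration of "learning a function from data" (my own example, not from the article): the program below is never told the rule y = 3x + 1; it recovers it from noisy observations by gradient descent, which is the basic shape of most machine learning.

```python
import random

# Noisy samples of an unknown underlying function y = 3x + 1.
xs = [random.uniform(-1, 1) for _ in range(200)]
data = [(x, 3 * x + 1 + random.gauss(0, 0.1)) for x in xs]

w, b = 0.0, 0.0   # model parameters; the model starts knowing nothing
lr = 0.1          # learning rate

for _ in range(500):  # gradient descent on mean squared error
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}")  # should land near w=3, b=1
```

Swap the straight line for a deep neural network and the hand-derived gradients for automatic differentiation, and you have the modern version of the same idea.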
If machine learning interests you: this website has a number of interactive toys that demonstrate the family of deep-neural-network algorithms. This one is my favorite. Udacity also has some fantastic free courses on machine learning. In general, if you're interested in AI, you'll have to look past the media; books and journal papers are your best resources, and there are a few good video courses.
So, apparently the bots are supposed to determine which of them can speak. Did you notice that the only bot to perform any action stood up and THEN spoke? If they were all running the same software, logically each bot would stand up and then attempt to speak. This is ridiculous on so many levels. Even if every bot were running the same software and only one recognized its ability to speak, unless it learned that fact from a state of zero knowledge (as opposed to being programmed to recognize a pre-programmed sound clip and then play another sound clip), it is no more intelligent than a microwave.
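To put the objection concretely, here's a purely hypothetical sketch (I obviously haven't seen the actual robot code) of what the 'which of us can speak' test could amount to:

```python
# Hypothetical reconstruction of the demo's logic, NOT the actual robot
# code. Each bot attempts to speak and listens for its own voice; only
# the unmuted one hears itself and "concludes" it was not silenced.
class Bot:
    def __init__(self, name: str, muted: bool):
        self.name = name
        self.muted = muted

    def stand_up(self) -> None:
        print(f"{self.name} stands up.")

    def try_to_speak(self) -> bool:
        self.stand_up()              # identical software: every bot stands
        heard_self = not self.muted  # microphone check, i.e. the mute flag
        if heard_self:
            print(f"{self.name}: I heard my own voice, so I can speak.")
        return heard_self

bots = [Bot("A", muted=True), Bot("B", muted=True), Bot("C", muted=False)]
for bot in bots:
    bot.try_to_speak()
```

If it's anything like that, then all three bots should stand and run the test, and the 'insight' reduces to reading back a mute flag.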
I can't figure out if there's any point or purpose in having self-aware robots. Also, I'm not sure what a self-aware robot is supposed to be, or what that even means, and I can't tell in which order these questions should be tackled. Automatons sensing whether what they say can be heard is not exactly new; that's the core of CSMA, but network switches are not considered self-aware. I agree that for an automaton to be able to answer a question, sense that it has answered, and correct its answer accordingly is impressive, but I'm kind of iffy on what that has to do with self-awareness.
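For the curious, the CSMA analogy is just 'listen to the medium before and while you talk'. A heavily simplified sketch (toy illustration, not a faithful 802.3 implementation; the callbacks are placeholders):

```python
import random
import time

# Toy CSMA/CD-style send loop: sense the channel, transmit, and back off
# on collision. channel_busy, transmit and collision_detected are
# placeholder callables standing in for the actual hardware.
def send_frame(channel_busy, transmit, collision_detected, max_attempts=16):
    for attempt in range(max_attempts):
        while channel_busy():         # carrier sense: wait for an idle medium
            time.sleep(0.001)
        transmit()
        if not collision_detected():  # "did what I said actually get heard?"
            return True
        # binary exponential backoff before retrying
        slots = random.randint(0, 2 ** min(attempt + 1, 10) - 1)
        time.sleep(slots * 0.000512)  # toy slot time
    return False
```

A network card doing this is, in a weak sense, checking whether what it 'says' can be heard, and nobody calls that self-awareness.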
It's really fun to speculate about, but we need more proof and more scenarios demonstrating self-awareness.