Important new text added at end after 2 hour delay.
Well:
1) their brain doesn't have to be in their body and
2) nothing gets smaller faster than highly integrated electronics.
Experience can be as fast as you like. Listen to a human talk at 100x normal speed, for example.
True, the AI of the robot could be separate from the robot's body, but I still don't want him near me, as I have often seen communication interference on TV, and even a total drop of signal a few times. If I'm sick in a hospital bed, I want a good-looking young female nurse washing me, not some robot whose brain is in the basement.
On (2): yes, much smaller still is possible, but my limiting point was the density of heat generation, not circuit volume. I forget the details, but in information theory there is, for any non-reversible logic machine, a least possible energy loss for each bit switched. It is related to the fact that if the whole process is not reversible (i.e. you cannot start with the decision the robot makes and then reverse his calculation to discover the "inputs" it was based on), entropy will be increased, as it is in ANY non-reversible process that makes heat. The only way a process can avoid producing heat is if it is 100% reversible.
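To put a number on that least energy loss: the limit recalled here appears to be Landauer's principle, which sets the minimum heat at k*T*ln(2) per bit erased. The sketch below is only an illustration under that assumption. The function names, the 300 K temperature, and the 1e18 erasures per second are hypothetical choices made for the example, not figures from this thread.

```python
import math

# Boltzmann constant in joules per kelvin (exact value in the 2019 SI).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Minimum heat released per irreversibly erased bit: k * T * ln(2)."""
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


def min_heat_watts(bit_erasures_per_second: float,
                   temperature_kelvin: float = 300.0) -> float:
    """Lower bound on heat output for a given rate of irreversible bit erasures."""
    return bit_erasures_per_second * landauer_limit_joules(temperature_kelvin)


if __name__ == "__main__":
    # Hypothetical example: room temperature and 1e18 erasures per second.
    per_bit = landauer_limit_joules(300.0)
    print(f"per bit at 300 K: {per_bit:.3e} J")
    print(f"1e18 erasures/s:  {min_heat_watts(1e18):.3e} W")
```

At 300 K this comes to roughly 2.87e-21 J per bit, so even 1e18 irreversible erasures per second has a theoretical floor of only about 3 milliwatts. Real circuits dissipate far more than this floor per switching event, so the bound only says how low the heat density could go in principle.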
Yes, it is true that humans can understand speech much faster than they can make it. But as I recall, only by a factor of about 20, not 100, for extended speech (more than a few seconds) if comprehension is not to be seriously degraded. (I think you can understand a sentence or two of faster speech because you don't really do it directly, i.e. you can hold it in "short term memory" and "play it back" more slowly.)
To even double that "times 20" rate, the speech must be pre-processed in a computer to greatly reduce the duration of the vowel sounds, along with several other tricks.
Natural speech spends much more time making the vowel sounds than is needed.
Closely related is the fact that with a computer's rapid display of short segments of the text at the center of the screen, where the reader keeps his fixation point, at least 10-fold faster* reading with full comprehension is possible. This is because in normal reading of printed text, about 90% of the time the mind totally ignores the motion-blurred image on the retina while you make a saccade to the next fixation point.**
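The 10-fold figure is just arithmetic if the 90% number is taken at face value: if only the remaining 10% of reading time carries usable visual input, removing the ignored time could speed reading up by at most 1 / (1 - 0.9) = 10. A minimal sketch of that calculation follows; the function name is hypothetical, and the 0.9 is the recalled figure above, not a measured value.

```python
def max_reading_speedup(ignored_fraction: float) -> float:
    """Upper bound on speedup if the ignored share of reading time is eliminated.

    ignored_fraction is the share of normal reading time during which the
    retinal image is not used (0.9 in the recollection above).
    """
    if not 0.0 <= ignored_fraction < 1.0:
        raise ValueError("ignored_fraction must be in [0, 1)")
    return 1.0 / (1.0 - ignored_fraction)


if __name__ == "__main__":
    for fraction in (0.5, 0.8, 0.9):
        print(f"ignored {fraction:.0%} -> up to {max_reading_speedup(fraction):.1f}x")
```

This is only an upper bound from the time budget. It says nothing about whether comprehension keeps up at that rate.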
There are some very interesting experiments related to this and to the fact that saccades are ballistic. Modern eye motion detectors (and the best ones 40 years ago) permit the next fixation point to be accurately predicted by the time only 10% of the saccade has been completed. Thus the next segment of text to be read can be displayed there, and all the rest of the screen can be filled with rapidly changing letters in the book's lines of text. The reader does not even notice this, but you, looking at the same screen and not knowing where each fixation will be, can't read anything. You just see a full page of rapidly changing letters. It is not uncommon for the reader to say: "I'm ready - let's get started with the experiment," when in fact, for several minutes, the screen has been displaying scrambled letters everywhere except for the short line (less than 2 inches long) of three to five words that he is making a saccade to!
* This is (or was) called rapid sequential viewing (or reading); the usual name now is rapid serial visual presentation (RSVP). Much of the early work, more than 35 years ago, was done at JHU Braddock visual center. I often visited the lab doing it over about a month, but the good-looking female technician running the tests was not the least bit interested in me.
For reasons not clear then (and perhaps still not), there must be one blank frame (dark, as I recall) projected on the screen before the next text image is shown. They had a special "refresh faster than TV" computer-driven monitor, but I forget what its framing rate was. It was faster than most people tested could read, but for one guy they could not discover his max reading speed. He had full comprehension of what he read when every frame of text was shown briefly, only once, by a single frame! I.e. he could read new text well at half the special monitor's refresh rate.
** You can easily demonstrate that the motion-blurred image on the retina is totally ignored: put your nose near a mirror and look at your left eye, then look at your right eye. To do that with your head still, you had to "swing" your direction of gaze (rotate your eyeballs), but you will never see their rotation - that retinal data is totally ignored.
BTW, all the above is pulled from more than three decades of storage in my memory - I rarely search for information elsewhere - too much crap on the internet. If you search and find some error in the above, please tell me. My fantastic memory for an old man still has the "rewrite function" operational.