That's a really good question. Size, obviously, but that in itself is not very interesting. From their size and other cues, I'd say I'll first notice their sex, with their age a very close second. What will register after that is their apparent health, then their social status (clothing, demeanor). All that happens in a few seconds.
Once I have observed their facial expressions and gaze, I'll have a very rough expectation of how smart they will sound. The first words they utter will narrow that estimate down further. In a pretty short time, I will have them pigeonholed in my mind. Anything that turns up later and contradicts that picture will register as a surprise, one that an observer could probably read from my reactions.