I played with this technique a few years ago, too, when it was doing the rounds. I worked from an academic paper that proposed a method (although I hear that Jon Thompson's guide is far easier to follow), with moderate success. The challenge is finding images that can be meaningfully combined. Faces tend to work well, perhaps because of the mind's ability to find faces in anything; for the most striking effect, try subjects of different genders (I could never achieve much success with different races, though). Anything else needs a little thought.
Consider, for example, this bicycle/motorcycle (http://cvcl.mit.edu/hybrid_gallery/moto_bike.html) on the MIT hybrid images site. It works because of the shape similarity between the motorcycle and the shadow of the bicycle (and it's emotionally striking too because, hey: who's not cycled and wished they were on a motorbike instead, at some point or other?). If the artist had opted to merge a bicycle with a face, they'd have a far harder time.
As Lady of Mystery noted, text can often be done with greater reliability. A major reason for this, I think, is that the "fuzziness" around the edges of the 'invisible' words is quickly assimilated into the background of the image and thus ignored by the mind. It's like a stereogram: all the information is there, it's just that our brains are wired to see different things at different times.
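For anyone who wants to experiment, here is a minimal sketch of the recipe hybrid images are usually described with: keep only the coarse, blurry content of the image meant to be seen from far away, keep only the fine detail of the image meant to be seen up close, and add the two. It is not necessarily the method from the paper or guide mentioned above. The function names, the single `sigma` shared by both images, and the plain lists-of-floats image format are all assumptions made to keep the example dependency-free.

```python
import math

def gaussian_kernel(sigma):
    """1-D Gaussian weights, normalised so they sum to 1."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    """Blur every row with the kernel, repeating edge pixels at the borders."""
    radius = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def low_pass(img, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    blurred = blur_rows(img, kernel)
    return transpose(blur_rows(transpose(blurred), kernel))

def hybrid(far_img, near_img, sigma=2.0):
    """Blurry part of far_img plus fine detail of near_img.

    Both images are equal-sized greyscale grids (lists of rows of floats).
    Sharing one sigma is a simplification; separate cutoffs can be used.
    """
    far_low = low_pass(far_img, sigma)
    near_low = low_pass(near_img, sigma)
    return [
        [f + (n - nl) for f, n, nl in zip(f_row, n_row, nl_row)]
        for f_row, n_row, nl_row in zip(far_low, near_img, near_low)
    ]
```

The result is not clipped, so values can fall slightly outside the input range and may need clamping before display. A larger `sigma` pushes more of the far image into the blur and leaves more of the near image as detail, which is the knob to play with when a pair of images refuses to combine cleanly.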