
My last post left off by asking readers to play 20 Questions using people as the intended objects and then, reflecting on how that unfolded, to read about the Frame Problem – a much-discussed and much-debated issue in both computer science and contemporary philosophy.

Before I get into what I believe to be the applications of the Frame Problem to today’s search technology paradigm, I will go back to the thread of “properties” to which I promised you I would return.

Remember the “properties” of George Bush that we discussed – properties such as “IS_FUNNY” and “IS_FORMER_PRESIDENT_OF_U.S.”? These were things that the search engine did not understand and could not use to help the user find more “useful” results, despite finding results that were, technically, “relevant” to “George Bush”.
To show the importance of properties in general information retrieval (and now I am going far beyond just search technology), try playing 20 Questions again as if you were a typical search engine. Someone starts the game with a person in mind. You would be tempted to ask something like “Is this person in the news?” or “Is this person female?”. But things like “HAS_GENDER” and “IS_FAMOUS” are properties, aren’t they? So you can’t do that. If you were a search engine, all you could do is blindly throw out contexts where you had encountered a “person” in the past – definitions, lists of synonyms, etc. You could only distinguish on the basis of frequency (or, more precisely, features) of occurrence. Now, you are never going to get anywhere in 20 Questions this way, are you? And this is why search engines that can’t distinguish properties don’t get you useful results, even though what they produce may be relevant or “popular”.
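
To make the contrast concrete, here is a minimal Python sketch of a 20 Questions player that is allowed to use properties. Everything in it is an assumption made for illustration: the tiny `PEOPLE` table, the yes/no property names (including `IS_FEMALE` as a stand-in for “HAS_GENDER”) and the helper functions are hypothetical, and no real search engine is claimed to work this way. The player simply asks about whichever property splits the remaining candidates most evenly.

```python
# Hypothetical mini knowledge base: the people, properties and values
# are illustrative assumptions, not data from any real system.
PEOPLE = {
    "George W. Bush": {"IS_FORMER_PRESIDENT_OF_U.S.": True, "IS_FEMALE": False, "IS_MUSICIAN": False},
    "Angela Merkel": {"IS_FORMER_PRESIDENT_OF_U.S.": False, "IS_FEMALE": True, "IS_MUSICIAN": False},
    "Aretha Franklin": {"IS_FORMER_PRESIDENT_OF_U.S.": False, "IS_FEMALE": True, "IS_MUSICIAN": True},
    "Paul McCartney": {"IS_FORMER_PRESIDENT_OF_U.S.": False, "IS_FEMALE": False, "IS_MUSICIAN": True},
}
PROPERTIES = ["IS_FORMER_PRESIDENT_OF_U.S.", "IS_FEMALE", "IS_MUSICIAN"]


def best_question(candidates, properties):
    """Return the property whose yes/no answer splits the candidates most evenly."""
    def imbalance(prop):
        yes = sum(1 for name in candidates if PEOPLE[name][prop])
        return abs(2 * yes - len(candidates))  # 0 means a perfect half/half split
    return min(properties, key=imbalance)


def play(secret):
    """Ask property questions until one candidate is left or questions run out."""
    candidates = set(PEOPLE)
    remaining = list(PROPERTIES)
    asked = []
    while len(candidates) > 1 and remaining:
        prop = best_question(candidates, remaining)
        remaining.remove(prop)
        answer = PEOPLE[secret][prop]  # the other player answers truthfully
        candidates = {n for n in candidates if PEOPLE[n][prop] == answer}
        asked.append(prop)
    return candidates, asked


if __name__ == "__main__":
    print(play("George W. Bush"))
```

With four people and three properties, two questions are enough to isolate anyone in this toy table. Take the properties away and the player has nothing to ask about, which is exactly the position I am claiming the search engine is in.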

All of this ties in with the notion of the Frame Problem. This problem, as I mentioned before, has long been discussed and disputed in artificial intelligence and philosophy. But it is relevant not just to search technology; it bears on the very activity of search in general – on the idea of task completion, really. So, your “task” in 20 Questions is to guess the identity of a person, place or thing within a certain number of tries, and to complete this task as efficiently as possible (and “win” the “game”) you must have a strategy. The importance of a “strategy” in completing any task – from supplying search engine users with good results to winning 20 Questions – cannot be overstated.

In fact, if you read Daniel C. Dennett’s seminal work on the Frame Problem (see Dennett, D.C. 1984. Cognitive wheels: The frame problem of AI. In C. Hookway (ed.), Minds, Machines and Evolution, 129-151), you will quickly learn how much knowledge is required just to make a turkey sandwich! The Frame Problem is really about “framing” the knowledge required for task completion so that it involves neither too much nor too little data. A human being holds all kinds of knowledge while making a turkey sandwich, but only a subset of it is relevant to completing the task. For example, you know that refrigerators keep things cold, but you don’t really need to draw on that knowledge to make your sandwich, do you? So effective task completion involves not just knowing how to do something but using the right knowledge at the right time.
I will leave off with a Google search for Toyota, which has at least three possible referents: an organization, a product manufactured by that organization, and a place. Google is able to separate genre pretty well – that is, it keeps news separate from Wiki pages and from Twitter feeds. So while genre recognition is indeed bringing our search technology closer to notions of “utility” and salient contextual knowledge, it still falls short of truly recognizing the properties of entities.
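
The difference between genre and entity properties can be sketched in a few lines of Python. The result titles, the `genre` and `entity_type` labels and the `group_by` helper below are all hypothetical, invented purely for illustration; this is not how Google or any other engine actually labels its results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    title: str
    genre: str        # where the page comes from, e.g. "news" or "wiki"
    entity_type: str  # what "Toyota" refers to on that page


# Made-up results for the query "Toyota" (illustrative assumptions only).
RESULTS = [
    Result("Toyota Motor Corporation company profile", "wiki", "organization"),
    Result("Toyota, Aichi (city in Japan)", "wiki", "place"),
    Result("Toyota Corolla", "wiki", "product"),
    Result("Toyota announces quarterly earnings", "news", "organization"),
    Result("Review of a new Toyota sedan", "news", "product"),
]


def group_by(results, attribute):
    """Group result titles by the value of the given attribute."""
    groups = defaultdict(list)
    for result in results:
        groups[getattr(result, attribute)].append(result.title)
    return dict(groups)


if __name__ == "__main__":
    print(group_by(RESULTS, "genre"))        # referents stay mixed together
    print(group_by(RESULTS, "entity_type"))  # one group per referent
```

Grouping by `genre` leaves the company, the car and the city jumbled together inside the “wiki” bucket. Only grouping by `entity_type`, which is a property of the entity itself, pulls the three referents apart.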

More next time… Until then, check out Dennett 1984, and this time think about how to program a robot to be good at 20 Questions!