Project „New Search Engine“
Relevance methodology (metrics), i.e. how good and bad links are determined

Motto: I cannot determine a priori that a link is good. But I can determine that a link is bad. Good links are then those links that are not bad.

Good (correct) links are links to relevant WWW pages, i.e. to reasonably large and high-quality pages that correspond to the searched word.

Bad (wrong, incorrect) links are links to: more general pages, non-corresponding pages, small or low-quality pages, less general pages, or duplicate pages.

Links are tolerated if they:
- lead to a more general page where the searched word is one of several basic words
- lead to a less general page that is nevertheless large and high-quality (the home page of a server on the given theme)
The main reason for this tolerance is that links are evaluated in an either-or way (good or bad).

When searching for the word „ships“:
- a more general page is about transportation vehicles
- a non-corresponding page is about cars
- a small or low-quality page contains two sentences and some images of ships (wrong ones at that)
- a less general page is on the theme „the history of ship construction in Argentina“
- duplicate pages are the main page and a subpage of the same site, or two pages concerning the same object
A link is tolerated if it leads:
- to a server about transportation vehicles that contains the sections cars, planes and ships, i.e. where ships are the majority or an important part
- to a large server about trade ships
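
The categories and the two tolerance rules can be written down as a small decision function. This is only a minimal sketch in Python: the names Kind, is_bad, word_is_basic and big_and_high_quality are hypothetical, and the category of each page is assumed to be assigned by a human judge, not computed.

```python
from enum import Enum


class Kind(Enum):
    # Hypothetical labels, assigned by a human judge for one searched word.
    RELEVANT = "relevant"
    MORE_GENERAL = "more general"
    NON_CORRESPONDING = "non-corresponding"
    SMALL_OR_LOW_QUALITY = "small or low-quality"
    LESS_GENERAL = "less general"
    DUPLICATE = "duplicate"


def is_bad(kind, word_is_basic=False, big_and_high_quality=False):
    """Return True if a link to a page of the given kind counts as bad."""
    if kind is Kind.RELEVANT:
        return False
    # Tolerance 1: more general page where the searched word is a basic word.
    if kind is Kind.MORE_GENERAL and word_is_basic:
        return False
    # Tolerance 2: less general page that is big and high-quality.
    if kind is Kind.LESS_GENERAL and big_and_high_quality:
        return False
    return True


# Judging a few results for the searched word "ships".
vehicles_page = is_bad(Kind.MORE_GENERAL)                          # bad
vehicles_server = is_bad(Kind.MORE_GENERAL, word_is_basic=True)    # tolerated
argentina_page = is_bad(Kind.LESS_GENERAL)                         # bad
trade_ships = is_bad(Kind.LESS_GENERAL, big_and_high_quality=True) # tolerated
```

Because the evaluation is either-or, the function returns only True or False; a tolerated link is simply counted as not bad.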

If the searched word has several meanings, the opinion of the majority of people is decisive (i.e. the meaning by which most people understand the word).

Among the found links I determine the bad links. The remaining links are good.
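
This procedure, marking the bad links and treating everything else as good, can be sketched in a few lines of Python. The function name split_links and the example.org addresses are hypothetical and serve only as an illustration.

```python
def split_links(found_links, bad_links):
    """Split found links into (good, bad), keeping the search engine's order."""
    bad = set(bad_links)
    good = [url for url in found_links if url not in bad]
    rejected = [url for url in found_links if url in bad]
    return good, rejected


# Hypothetical result list for one searched word.
found = [
    "http://example.org/ships",
    "http://example.org/cars",
    "http://example.org/ships-argentina-history",
    "http://example.org/ship-types",
]
marked_bad = [
    "http://example.org/cars",
    "http://example.org/ships-argentina-history",
]
good, rejected = split_links(found, marked_bad)
```

Note that nothing in the code decides that a link is good; a link ends up in the good list only because it was not marked as bad.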

This methodology is objective. It considers each link from the viewpoint „how many users find it relevant“, which leads to a relatively exact evaluation of the quality of the ordering of links.
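
One possible way to turn the good/bad judgements into a single number for the ordering is the share of good links among the first k results. The text above does not prescribe any formula, so the function good_share and the cut-off k are assumptions made only for this sketch.

```python
def good_share(labels, k=10):
    """Fraction of good links among the first k results.

    labels: list of booleans in result order, True = good, False = bad.
    """
    top = labels[:k]
    if not top:
        return 0.0
    return sum(top) / len(top)


# Hypothetical judgements for the first four results of one query.
share = good_share([True, False, True, True], k=4)
```

A higher share on the first page means a better ordering under this assumed measure; other measures built on the same good/bad labels are equally possible.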

The most frequent error of search engines is placing less general pages on the first page of found links (in front of relevant, more general pages).
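
This error can be located in a judged result list by looking for less general pages that stand in front of a relevant page. The following Python sketch assumes that each result already carries a hand-assigned label; the function name misplaced_less_general and the label strings are hypothetical.

```python
def misplaced_less_general(kinds):
    """Return positions of less general pages ranked ahead of a relevant page.

    kinds: list of labels in result order, e.g. "relevant" or "less general".
    """
    # Position of the last relevant page, or -1 if there is none.
    last_relevant = max(
        (i for i, kind in enumerate(kinds) if kind == "relevant"),
        default=-1,
    )
    return [
        i
        for i, kind in enumerate(kinds)
        if kind == "less general" and i < last_relevant
    ]


# Hypothetical labels for the first five results of one query.
positions = misplaced_less_general(
    ["less general", "relevant", "less general", "relevant", "less general"]
)
```

The last less general page in the example is not reported, because no relevant page follows it.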

Here I am dealing with the internal quality (errors) of search engines, not with spam (external influences).