WO2008023994A1 - Method of targeting messages - Google Patents
Method of targeting messages
- Publication number
- WO2008023994A1 (PCT/NO2007/000301)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- search
- messages
- database
- domains
- user
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Definitions
- the present invention relates to a method and a system for targeting messages.
- the invention describes a method and a system for providing messages to users of a vertical search system based on user input.
- a wide variety of services are currently being provided electronically to users of the Internet. Many such services involve receiving information from a user and performing a search for a publication corresponding to the information received from the user.
- a publication can be a particular web page, a reference to a printed publication, a description of goods for sale, a record in a database, or almost any piece of information that can be referenced and that can be identified in some meaningful manner based on the search input from the user.
- Providers of search services may often want to provide a user with additional information when the results of the search are presented. If the search is for a particular type of item in an online catalog of goods, the provider may want to present information regarding similar types of goods, and if the search is for a web page describing a geographical location, the search provider may want to present an advertisement for a travel agency or an airline servicing that location.
- a vertical search is a search for information that has already been classified as belonging to some domain, such as a particular business or profession.
- a vertical search service may, according to the invention, provide a first database of searchable references to publications, where the references are classified as belonging to at least one of a plurality of domains.
- the various domains represent different vertical fields of search.
- Such a service may then provide a second database of messages that are specified as being relevant to at least one of the domains.
- When a request to perform a vertical search is received by the service, it will include a specification of which domain to search.
- a message may then be selected based on the specification of domain. This message may then be transmitted back to the user along with the results of the search.
- the invention further provides for a specification of the various domains, the classification of messages, and the design of the user interface presented to users to be complementary, such that users are encouraged to provide information about themselves, their profession, hobbies, specialties etc in a manner that facilitates efficient searching as well as well targeted selection of messages.
- a request to perform a search is received in response to a presentation to the user of a user interface where the user can select one of a plurality of domains, enter one or more search criteria and cause said selected domain and said search criteria to be transmitted.
- the messages may include one or more media types, such as text, image, video, animation and audio.
- additional identifiers associated with the messages can be used to select not only based on domain, but also by an evaluation of correspondence between the search criteria provided by the user and the additional identifiers.
- the messages may include a component with which a user can interact when the message is displayed on a client computer.
- a component may be a hyperlink or some similar device allowing the user to interact with the message.
- a short message may be transmitted back to the search service, where a record of received tokens is kept. Such a record can be used to generate statistics of how often the various messages have been displayed and have caused a user to interact with them.
- the present invention provides a method and a system for providing the various features and advantages of the invention, as specified in the appended claims.
- FIG. 1 illustrates a system consistent with an embodiment of the invention.
- FIG. 2 illustrates, in a flow chart, a crawling and indexing process.
- FIG. 3 illustrates a process of entering messages in a targeted messages database.
- FIG. 4 illustrates a process of performing a vertical search and retrieving search results and targeted messages.
- FIG. 5 A and FIG. 5 B show a toolbar used as a search form for performing a search.
- a new vertical search system may be based on a combination of controlled crawling, classification and domain specific indexing. This may make it possible to selectively seek out and index pages that are relevant to a professional domain, or some other domain defined by how the pages relate to a pre-defined topic or set of topics.
- the topics may be specified by keywords, or alternatively by using exemplary documents.
- the vertical search system indexes pages that are likely to be most relevant for a particular domain and avoids irrelevant regions of the web. (It should be understood that domain here refers to the topical domain defined by a profession, a hobby, or some other particular field of knowledge or information, not to Internet domains defined by domain names.)
- Focused indexing means that only a certain subset of available web pages will be indexed based on a certain rule.
- the content of a page that is indexed is analyzed and categorized. If it fits in a given list of interests, then the page is stored and the links that are stored in that page may be marked as candidates for further indexing in a web crawler like process.
- the rule may be that if the content of the page can be defined as "medical", including all the aspects of the medical area (doctors, patients, diseases, treatments, medications, hospitals, research, etc), the page should be included in a database where the topic is medical information.
- the topical domains may be categorized not only according to general topical information, but according to demographical information about the intended audience.
- categories could then be doctors, nurses, ambulance personnel, insurance agents, and lawyers.
- This categorization may be the top level category (e.g. the database contains information for doctors), or it may be a sub-category of the top category (e.g. within the field of medical information the user may specify a demographic sub group).
- This demographical categorization may serve two purposes. First, it helps target information more precisely: even if the topic is brain surgery, it can be assumed that a surgeon, a nurse, a patient and a lawyer are after different types of documents within that general topic.
- FIG. 1 is a block diagram of a system 100 configured to operate in accordance with an embodiment of the invention.
- the exemplary system 100 as illustrated in FIG. 1 includes three subsystems, a vertical database generation subsystem 110, a targeted messaging subsystem 120, and a searching subsystem 130.
- the system 100 may also include two databases, a vertical information database 140 and a targeted messaging database 150.
- the first subsystem 110 is a vertical database generation subsystem.
- this subsystem may include a classifier 112, a crawler module 113, an indexer 114 and a ranking system 115.
- the various modules are able to communicate over a common system bus 105, which may extend to or be replicated in the other subsystems, as will be further described below.
- the search parameter interface 111 is used during creation, maintenance and expansion of the vertical search system. Over this interface, a definition of one or more domains may be entered into the system.
- the domains represent the vertical domains, or topical domains, that will be available in the system 100.
- a list of these domains may be stored in a taxonomy table in the vertical database 140.
- Topics to be included in the taxonomy table may be defined by a professional community (e.g. medical) and relate to various professions or topics within this community (e.g. cardiology, radiology etc).
- a number of exemplary documents relevant to one or more topics may then be input into the system 100 over the interface 111.
- the sample documents may typically be selected by one or more persons representing the professional community.
- topics and documents may be added in order to expand or refine the search database over time.
- URLs may refer to seed pages, or sites, on the Internet or in some other repository of documents, as will be further described below.
- the sample documents may, according to aspects consistent with principles of the invention, be passed to a classifier 112.
- the classifier may parse the sample documents and create a statistical representation of them, based e.g. on the number of times certain words occur. If the topic is cardiology, dominating words may typically be such words as heart, blood, cardiology, etc.
- the process of inputting sample documents into the classifier in order to generate these statistics may be referred to as training.
- the classifier will be able to classify additional documents. If, for example, an arbitrary document retrieved from the Internet is presented to the classifier 112, the classifier may parse the document, generate statistics and compare the statistics with the statistics created for the various topics during training. A measure of the similarity may then be generated, and this may be used as an indication of the degree to which the document can be considered as relevant to the particular topic.
- the metrics used in this process may be referred to as topic models or category models.
- the classifier may be configured to classify each document as belonging to the one category with which it is most similar, or alternatively a document may be considered as belonging to several categories. Also, documents belonging to the same category may all be considered equally relevant, or their relevance may be weighted based on the degree of similarity with the training data. Documents may also be rejected as not being relevant to any of the topics.
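The training and classification steps described above can be illustrated with a minimal sketch. It assumes a plain word-count representation and cosine similarity as the similarity measure; the source does not name a specific measure, so both choices, as well as the function names and the `threshold` parameter, are illustrative assumptions.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Count how often each word occurs in a document (lower-cased)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def train(samples_by_topic):
    """Build one aggregate word-count model per topic from sample documents."""
    models = {}
    for topic, docs in samples_by_topic.items():
        model = Counter()
        for doc in docs:
            model.update(term_frequencies(doc))
        models[topic] = model
    return models


def cosine(a, b):
    """Cosine similarity between two word-count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, models, threshold=0.2):
    """Return (topic, score) for the most similar topic.

    The topic is None when no model reaches the (hypothetical) threshold,
    which corresponds to rejecting the document as irrelevant.
    """
    tf = term_frequencies(text)
    scores = {topic: cosine(tf, model) for topic, model in models.items()}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]


models = train({
    "cardiology": ["The heart pumps blood.",
                   "Cardiology studies heart disease and blood vessels."],
    "radiology": ["Radiology uses x-ray imaging.",
                  "An MRI scan is a radiology imaging method."],
})
topic, score = classify("blood flow in the heart", models)
```

The returned score could also serve as the weighted relevance mentioned above, and a multi-category variant would return every topic whose score exceeds the threshold.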
- the subsystem may further include a crawler 113.
- the crawler 113 may be delivered the URLs of a number of seed sites, or documents, as input.
- the selected sites may again be selected by one or more persons representing the professional community as representative quality documents.
- the documents may also be selected based on their assumed quality as starting points for the crawling process. This assumption will not only be based on the quality of the content of the document itself, but also on how they reference other documents, e.g. by way of hyperlinks, and the location and assumed quality of the referenced documents.
- the crawler 113 may parse the seed documents until it finds references to other documents. These referenced documents may then be retrieved and parsed in a similar manner for additional references to new documents. This process may be repeated, in principle indefinitely, and the number of collected documents will grow.
- a practical implementation of the crawler 113 may include the creation and maintenance of a crawler table where all URLs are stored. All documents referenced in the crawler table may then be revisited by the crawler 113 (i.e. retrieved again) at regular intervals. In this manner the crawler table is permanently updated and the indexed content, described further below, is refreshed.
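A crawler table with regular revisits might be sketched as follows. The class name, the use of numeric timestamps, and the single fixed revisit interval are assumptions made for illustration; the source only states that all URLs are stored and revisited at regular intervals.

```python
class CrawlerTable:
    """Hypothetical crawler table: URL -> time of last retrieval (or None)."""

    def __init__(self, revisit_interval):
        self.revisit_interval = revisit_interval  # same unit as timestamps
        self._last_fetched = {}

    def add(self, url):
        """Register a URL; an already known URL keeps its last-fetched time."""
        self._last_fetched.setdefault(url, None)

    def mark_fetched(self, url, now):
        """Record that the document at `url` was retrieved at time `now`."""
        self._last_fetched[url] = now

    def due(self, now):
        """Return the URLs never fetched or fetched at least one interval ago."""
        return sorted(
            url for url, fetched in self._last_fetched.items()
            if fetched is None or now - fetched >= self.revisit_interval
        )


table = CrawlerTable(revisit_interval=3600)
table.add("http://example.org/a")
table.add("http://example.org/b")
table.mark_fetched("http://example.org/a", now=0)
```

Calling `table.due(now)` periodically yields the documents the crawler should retrieve again so that the index can be refreshed.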
- the collected documents may be forwarded to the classifier 112, as described above, and the classifier 112 may determine whether any given document is sufficiently relevant to be included in the database 140.
- the crawler 113 may operate independently of the classifier 112. Alternatively, the crawler 113 may be configured to not follow links out of documents that are determined to be irrelevant by the classifier 112, not to follow links out of irrelevant documents that were linked to by irrelevant documents, or some similar rule. Such a rule may be imposed in order to avoid crawling irrelevant areas of the network.
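The first of these rules, not following links out of documents judged irrelevant, can be sketched as a breadth-first crawl with pruning. The `fetch` and `is_relevant` callables are hypothetical stand-ins for a downloader and for the classifier 112, and the toy `WEB` dictionary replaces real network access.

```python
from collections import deque


def focused_crawl(seeds, fetch, is_relevant, max_pages=100):
    """Breadth-first crawl that does not follow links out of irrelevant pages.

    fetch(url) must return (text, links); is_relevant(text) must return a
    bool. Both are supplied by the caller and are assumptions of this sketch.
    """
    queue = deque(seeds)
    seen = set(seeds)
    relevant = []
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        text, links = fetch(url)
        fetched += 1
        if not is_relevant(text):
            continue  # prune: links out of this page are never queued
        relevant.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return relevant


# Toy link graph standing in for the web: url -> (text, outgoing links).
WEB = {
    "seed": ("heart disease overview", ["a", "b"]),
    "a": ("heart surgery", ["c"]),
    "b": ("football scores", ["d"]),
    "c": ("blood pressure and the heart", []),
    "d": ("heart attack first aid", []),
}
found = focused_crawl(["seed"], WEB.__getitem__, lambda text: "heart" in text)
```

In this toy graph, page `d` is relevant but is only linked from the irrelevant page `b`, so it is never visited. That is the trade-off of such a rule: less crawling of irrelevant regions at the cost of possibly missing some relevant documents.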
- the subsystem 110 may include an indexer 114.
- the indexer creates an index of all retrieved documents in order to facilitate searching.
- the documents that are classified as relevant may also be subjected to a ranking algorithm in a ranking module 115.
- ranking may be based on the degree of relevance found by the classifier.
- Other ranking algorithms may be used instead of or in addition to the relevance measure, including algorithms based on link analysis, search term frequency etc.
- the vertical database generation subsystem 110 may be connected to the actual database 140 over a communications link 160.
- This communications link may also connect to the other subsystems as further described below.
- the communications link may be part of a local area or wide area network, or it may be part of or an extension of the system bus 105.
- the various tables and results produced by the subsystem 110 may be stored in the database 140.
- the targeted messaging subsystem 120 may include a messaging input interface 121 and a messaging exposure statistics module 122. These modules may be interconnected by a system bus 105.
- the messaging input interface 121 may be an interface for receiving the messages that are to be exposed to users of the search system 100 in accordance with the invention.
- the interface may be a user interface such as a web page interface accessible over the Internet, or some other communications interface over which messages can be received in order to be stored in a database of targeted messages 150.
- Each received message may, in addition to the message itself, include conditions for its display, including an identification of a demographic group, or user group, for which it is intended.
- a message exposure statistics module 122 may register each exposure of the various messages. According to one embodiment where the targeted messages may be associated with several alternative exposure criteria, the statistics module 122 may also generate statistics regarding which of the various conditions have triggered exposure.
- the resulting generated statistics may according to some embodiments of the invention be accessible over the same interface 121 used for uploading messages to the system 100.
- a third subsystem may be the searching subsystem 130.
- This subsystem is accessible by users of the system 100 in order for such users to input search requests and receive search results and targeted messages.
- the search subsystem 130 may include a searching user interface 131, which may be based on a web server, and a search engine 132.
- the search engine 132 receives search requests that may include search terms, a demographic identification of the user performing the search, and additional categorical identifiers used to narrow the search. Based on this input the search engine will perform a search in the vertical database 140 based on the index stored there, and generate a response based on the relative ranking of the documents identified as hits, i.e. documents that fulfill the search criteria. In addition the search engine may use information regarding the demographic identification, and optionally also additional categorical identifiers or search terms, to select one or more messages from the targeted messaging database 150. The selected messages will then be presented to the user over the user interface 131 together with the search results.
- the various subsystems may be tightly integrated into one, or distributed over several systems, according to design preferences.
- the two databases may be residing in the same database system or be distributed over two or more database systems.
- FIG. 2 shows a flow chart of a crawling and indexing process consistent with principles of the invention. It will be seen that according to the example illustrated in FIG. 2 the process may include two main branches, one where a classifier is trained to classify documents, and one where the web, or a subset of the web, is crawled and documents are retrieved.
- the system is provided with a taxonomy table in a step 201.
- the taxonomy table may be provided over the search parameter interface 111 as one or more topics to be stored in a taxonomy table.
- the classifier 112, using the taxonomy table as a starting point, is then trained on a set of exemplary documents in a step 202.
- the relevant topic models or category models are created 203.
- the classifier is now trained to classify documents as belonging to one or more categories.
- the degree to which a document belongs to a category may be weighted.
- a document may be relatively more relevant to doctors than to nurses.
- crawler seeds may be provided to the web crawler 113.
- the web crawler 113, starting with a number of seed documents, crawls the web or some defined subset of the web based on the seeds in a next step 205.
- Web crawling is well known in the art and can be summarized as a process where all documents specified as seed documents are retrieved and parsed. If any hyperlink referencing another document is found as a result of the parsing, this identified document is retrieved and parsed in the same manner.
- All documents that are retrieved during crawling, and that qualify according to some quality criteria, are passed 206 to the classifier module 112 for classification. It will be realized by those with skill in the art that the step 205 of crawling the web may continue while documents that have already been retrieved are further processed.
- the classifier 112 will use the topic model from step 203 in order to classify 207 the retrieved documents. Documents that do not fit within the topic model may be filtered out. The classified documents may then be indexed 208 by the indexer 114 and ranked 209 by the ranking module 115 according to some of the methods known in the art, and then stored in a searchable database 140 in a step 210. This brings the process of retrieving documents to an end 211.
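The indexing, ranking and storing steps (208 to 210) might look like the following sketch. It assumes an in-memory inverted index keyed by domain and term, and ranks hits by the relevance score assigned during classification; the class and method names are hypothetical, and the source leaves the actual indexing and ranking methods open.

```python
import re
from collections import defaultdict


class VerticalIndex:
    """Hypothetical per-domain inverted index with relevance-based ranking."""

    def __init__(self):
        self._postings = defaultdict(set)  # (domain, term) -> set of URLs
        self._relevance = {}               # (domain, url) -> classifier score

    def add(self, url, text, domain, relevance):
        """Index a classified document under its domain."""
        self._relevance[(domain, url)] = relevance
        for term in set(re.findall(r"[a-z]+", text.lower())):
            self._postings[(domain, term)].add(url)

    def search(self, domain, query):
        """Return URLs in `domain` containing every query term, best first."""
        terms = re.findall(r"[a-z]+", query.lower())
        if not terms:
            return []
        hits = set.intersection(
            *(self._postings.get((domain, term), set()) for term in terms)
        )
        # Higher relevance first; URL as a tie-breaker for a stable order.
        return sorted(hits, key=lambda url: (-self._relevance[(domain, url)], url))


index = VerticalIndex()
index.add("u1", "Coated stent trial", "cardiology", relevance=0.9)
index.add("u2", "Stent placement guide", "cardiology", relevance=0.6)
index.add("u3", "Stent imaging", "radiology", relevance=0.8)
```

Because the domain is part of the index key, a query for "stent" in the cardiology domain never touches the radiology document, which mirrors the idea of searching only within the designated vertical domain.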
- FIG. 2 illustrates a simplified process with a well defined beginning and end.
- a real life system would run the various parts of the process in parallel, and certain steps would be constantly repeated in order to add new documents to the database, refine the classification, improve ranking etc.
- FIG. 3 illustrates a process of entering messages in the targeted messages database 150.
- the process starts in a first step 300.
- the targeted messaging subsystem 120 receives a message over the messaging input interface 121.
- the received message may include one or more tags or pieces of meta information or some other form of attached information indicating which vertical domain or domains the message is relevant to.
- the received message is stored in the targeted messaging database 150.
- the entry may be indexed or otherwise identifiable based on the attached domain information in a manner that makes it possible to select messages based on the vertical domain or domains (topics) to which they are relevant. Additional criteria for selection may also be included, such as a designated priority.
- tracking of exposure may be established in a statistics module 122 in the messaging subsystem 120. This may be relevant, for example, if the messaging system is an advertising system where advertisers pay according to exposure, or if the above mentioned priority assigned to a message is based on previous exposure.
- the registration process may also involve presenting a confirmation of the successful registration of the message in the database 150. Otherwise the process may end in a final step 304.
- FIG. 4 illustrates a process of performing a vertical search and retrieving search results and targeted messages.
- the process starts in a first step 400.
- a search subsystem 130 receives a request for a search entry form over search interface 131.
- the search entry form may be transmitted 402 as an HTML document, and will be further described below.
- a user of the vertical search system 100 may fill in his or her search request by entering search terms in the search form, and in addition, indicate the vertical domain the user wants to search, as well as other categorizing information that serves to narrow the search.
- the search form may be implemented as a toolbar installed on a user computer as a plug in to a web browser, as a widget application, or even as a separate client application.
- in such cases the request for the search form 401 will be a request to download the application or plug-in, and the transmittal will be the delivered download from a server.
- In a step 403 the search request is received by the searching interface 131 and passed to the search engine 132.
- the search engine may then perform two searches, one in each database 140, 150. These searches may be performed sequentially or in parallel. In FIG. 4 they are illustrated as being performed in parallel, but this should be understood only as an illustrative example.
- One search 404 is in the vertical database 140 for information requested by the user.
- the other search 405 is in the targeted messaging database 150.
- the search in the vertical database 140 will use the definition of a topic or vertical domain in the search request to select which vertical domain to search. Additional information may also be present in order to limit the scope of the search. Search terms included in the search request, such as words, phrases or regular expressions, may be used to search for relevant information within the selected domain.
- the hits may then be returned based on how they were ranked by the ranking system 115. Additional ranking may be performed based on search terms or other forms of text analysis.
- the search in the targeted messaging database 150 may primarily be based on the selected vertical domain. Whether the domain defines a type of user, a particular topic or field of knowledge, or a combination of such categories, the messages in the database 150 may have been designated as relevant to only one or some of these categories.
- the search 405 in the targeted messaging database 150 returns messages that are designated in a way that corresponds with the search performed in the vertical database 140. As an example, if the search in the vertical database 140 is performed in a domain relevant to doctors, the search in the messaging database 150 will return messages designated as relevant to doctors. One or more of the returned messages may then be selected sequentially, randomly, based on previous exposure as recorded by the exposure statistics module 122, based on assigned priority, based on keywords included in the search request, or some combination of these. Any combination of some or all of these alternatives, and even additional alternatives, may be implemented in a system operating in accordance with the principles of the present invention.
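One of the many possible selection strategies can be sketched as follows: filter messages by the designated domain, then prefer higher assigned priority and, among equal priorities, fewer previous exposures. The `Message` fields and this particular ordering are assumptions chosen for illustration, not the only combination the text allows.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical targeted message record."""
    text: str
    domains: frozenset  # vertical domains the message is relevant to
    priority: int = 0   # assigned priority, higher is preferred
    exposures: int = 0  # how often the message has been shown so far


def select_messages(messages, domain, count=1):
    """Pick up to `count` messages designated as relevant to `domain`."""
    candidates = [m for m in messages if domain in m.domains]
    # Highest priority first, then least-exposed first.
    candidates.sort(key=lambda m: (-m.priority, m.exposures))
    return candidates[:count]


inventory = [
    Message("A", frozenset({"doctors"}), priority=1, exposures=5),
    Message("B", frozenset({"doctors", "nurses"}), priority=1, exposures=2),
    Message("C", frozenset({"lawyers"}), priority=9),
]
chosen = select_messages(inventory, "doctors")
```

A random or sequential strategy would replace only the sort step; the domain filter stays the same.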
- the results may be combined into a result page.
- the result page may include some or all of the information, or references to information (e.g. URLs) retrieved from the vertical database 140 and one or more messages retrieved from the targeted messaging database 150.
- the result page may then be transmitted to the user in a step 407.
- the exposure statistics module 122 may update the exposure statistics for the message or messages transmitted as part of the result page. Alternatively, the exposure statistics are updated prior to or simultaneously with the transmittal of the result page.
- the update of the exposure statistics may be dependent on further action by the user.
- the exposure statistics may be updated only after the user actually clicks on a message hyperlink in order to request additional information or otherwise indicates that the information has been received and read.
- the exposure statistics may register initial transmission as well as later interaction by the user.
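A statistics module that registers both the initial transmission and later interaction might be sketched as below. The class, its method names, and the `trigger` argument (the condition that caused the exposure, such as the profession selected by the user) are hypothetical.

```python
from collections import Counter


class ExposureStats:
    """Hypothetical exposure statistics: counts per (message, trigger)."""

    def __init__(self):
        self.shown = Counter()    # (message_id, trigger) -> transmissions
        self.clicked = Counter()  # (message_id, trigger) -> interactions

    def record_transmission(self, message_id, trigger):
        """Register that a message was sent as part of a result page."""
        self.shown[(message_id, trigger)] += 1

    def record_interaction(self, message_id, trigger):
        """Register that a user clicked or otherwise interacted."""
        self.clicked[(message_id, trigger)] += 1

    def interaction_rate(self, message_id):
        """Interactions divided by transmissions, over all triggers."""
        shown = sum(n for (mid, _), n in self.shown.items() if mid == message_id)
        clicked = sum(n for (mid, _), n in self.clicked.items() if mid == message_id)
        return clicked / shown if shown else 0.0


stats = ExposureStats()
stats.record_transmission("m1", "MD")
stats.record_transmission("m1", "MD")
stats.record_transmission("m1", "nurse")
stats.record_interaction("m1", "MD")
```

Keeping the two counters separate makes it possible to bill or prioritize on either transmissions or interactions, depending on which event is chosen to count as an exposure.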
- FIG. 5 A and FIG. 5 B show an example of how a toolbar representing a search form may look to a user wishing to utilize a vertical search and messaging system according to the present invention.
- the toolbar may be implemented as a plug-in to a web browser and displayed as part of the web browser user window on the display of a user computer.
- FIG. 5 A shows the toolbar 500 A before the user has entered any search terms.
- the toolbar 500 A includes a number of fields for entering or selecting search terms and vertical domain. It will be understood that the number and nature of the fields illustrated in the drawing are exemplary only, and that other alternatives are possible within the scope of the invention.
- a user may enter one or more keywords, search terms or a regular expression of such terms, dependent on the capabilities of the particular embodiment of the search engine 132 and the vertical database 140 the user is interacting with.
- the user has entered "Coated Stent" in the keyword field 501.
- In the next field 502 the user can select community.
- the chosen community is "Health & Medical".
- Additional fields include "profession" 503, "category" 504 and "sub-category" 505. In FIG. 5 B these have been chosen as "MD", "Medication" and "Cardiology", respectively.
- further fields include "filter category" 506 and "filter sub-category" 507, which may serve to exclude hits, again dependent on the functionality implemented in the system 100, particularly the search engine 132.
- filter category and sub-category are chosen as "Treatment" and "Infarction", respectively.
- the plug-in may then be configured to transmit a search request to the system 100.
- the search request may typically include the various values or words entered by the user, as well as necessary information such as the address of the client computer used by the user, in accordance with the communications protocol used between the client computer and the searching interface 131 of the system 100.
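Assuming the request travels over HTTP as a query string, the toolbar values could be encoded as in the sketch below. The endpoint URL and all parameter names are hypothetical; the source does not specify a wire format.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical address of the searching interface 131.
SEARCH_ENDPOINT = "http://search.example.com/search"


def build_search_url(keywords, community, profession=None, category=None,
                     sub_category=None, filter_category=None,
                     filter_sub_category=None):
    """Encode the toolbar fields as a query string, skipping empty fields."""
    fields = {
        "q": keywords,
        "community": community,
        "profession": profession,
        "category": category,
        "sub_category": sub_category,
        "filter_category": filter_category,
        "filter_sub_category": filter_sub_category,
    }
    params = {name: value for name, value in fields.items() if value}
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def parse_search_url(url):
    """Server-side counterpart: recover the fields from the request URL."""
    return {name: values[0] for name, values in parse_qs(urlsplit(url).query).items()}


url = build_search_url(
    "Coated Stent", "Health & Medical", profession="MD",
    category="Medication", sub_category="Cardiology",
    filter_category="Treatment", filter_sub_category="Infarction",
)
request = parse_search_url(url)
```

The values used here are the ones shown in the toolbar example of FIG. 5 B. `urlencode` takes care of escaping characters such as the space and the ampersand in "Health & Medical".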
- When the search request is received by the searching interface 131 it is forwarded to the search engine 132.
- the search engine will then search the database based on the vertical domain defined by the selections made by the user, including one or more of community 502, profession 503, category 504 and sub-category 505. It should be noted that the vertical domain may be defined by only one, or by several of the values selected by the user. Values selected by the user, but not used to define the vertical domain, may be used to specify the search in other ways, or they may only be used when selecting a message from the targeted messaging database 150.
- the filter values 506, 507 may be used to remove hits that otherwise would be returned by the search.
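How filter values remove hits is left open in the text. One possible reading, sketched below, is that every hit carries a set of category tags and is dropped when any of its tags equals a filter value. The hit representation and the function name are assumptions.

```python
def remove_filtered(hits, filter_values):
    """Drop hits tagged with any of the filter values (case-insensitive).

    hits is a list of (url, tags) pairs, where tags is a set of category
    names; this representation is an assumption of the sketch.
    """
    blocked = {value.lower() for value in filter_values if value}
    return [
        (url, tags) for url, tags in hits
        if not blocked & {tag.lower() for tag in tags}
    ]


hits = [
    ("u1", {"Medication", "Cardiology"}),
    ("u2", {"Medication", "Cardiology", "Infarction"}),
    ("u3", {"Treatment", "Cardiology"}),
]
remaining = remove_filtered(hits, ["Treatment", "Infarction"])
```

With the filter values from the toolbar example, only the first hit survives.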
- the various values, words or terms included in the search may also be used when the search of the messaging database 150 is performed. According to one embodiment only one value is used. This term may e.g. be the profession category 503 in the example illustrated in FIG. 5 A and FIG. 5 B. According to alternative embodiments, some or all of the additional categories, the filter categories and even the keywords or search terms may be used to perform the search and select messages from the targeted messaging database 150.
- the use of categories and sub-categories and/or filters can be used not only to make the search results in the vertical database more relevant to the user, but also to improve the choice of the targeted message(s).
- the messages may be tagged, or categorized, according to one or more vertical domains, to sub-categories of these domains, and possibly also to filter categories or refinement categories. Messages may then be selected only when as many tags in the search request as possible correspond with those of the message.
- a number of different messages can all be categorized as relevant to a given profession, such as medical doctor (MD). If the user only specifies his profession 503, any of these messages may be selected. Similarly, if only the category 504 is selected, for instance medication, any message relevant to medication may be selected. However, if both profession 503 and category 504 are specified as MD and medication, respectively, it will be possible to select messages where both of these are part of the described targeting of the message.
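The "as many tags as possible" rule can be sketched by counting, for each message, how many of its tags agree with the search request and keeping the messages with the highest count. The tag names and data layout are hypothetical, and the sketch does not penalize conflicting tags, which a real system might want to do.

```python
def best_matching_messages(messages, request_tags):
    """Return the messages whose tags agree with the request most often.

    messages is a list of (text, tags) pairs and request_tags a dict such
    as {"profession": "MD"}; both layouts are assumptions of this sketch.
    """
    def overlap(tags):
        return sum(1 for name, value in tags.items()
                   if request_tags.get(name) == value)

    scored = [(overlap(tags), text) for text, tags in messages]
    best = max((score for score, _ in scored), default=0)
    if best == 0:
        return []  # nothing matches the request at all
    return [text for score, text in scored if score == best]


catalog = [
    ("General notice for MDs", {"profession": "MD"}),
    ("Medication news", {"category": "Medication"}),
    ("New drug, for MDs", {"profession": "MD", "category": "Medication"}),
]
narrow = best_matching_messages(
    catalog, {"profession": "MD", "category": "Medication"})
broad = best_matching_messages(catalog, {"profession": "MD"})
```

When only the profession is given, both MD messages tie and either may be selected. When profession and category are both given, only the message targeted on both remains.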
- a message, or a link to a message, that fulfills all these criteria may be selected, such as: "Nicotinic Acid. A new drug for treating high cholesterol." The relevance and the frequency of interaction (clicking) for this message will be much higher for the user.
- the exposure statistics module 122 may be configured to register which of the different alternatives triggered each exposure (e.g. if a message was displayed to an MD or a nurse).
- the system 100 may be installed on one or more server computers using one of a number of well known operating systems such as one of the many variants of Unix (UNIX is a registered trademark of The Open Group), Linux (Linux is a registered trademark of Linus Torvalds) or Windows (Windows is a registered trademark of Microsoft Corporation).
- the databases may be based on a database management system (DBMS) that is able to interface to a search engine 132 and a web server.
- the various interfaces 111, 121, 131 may be implemented in a number of ways. According to some embodiments at least some of the interfaces are implemented using a scripting or other programming language (such as PHP, Perl, CGI, Java, JavaScript, ASP) running in conjunction with a web server. Communication may then be based on protocols such as TCP/IP and HTTP, but those with skill in the art will realize that alternatives may be used without departing from the scope and spirit of the invention. In particular, communication between the various subsystems of the system 100 may be implemented using other open or proprietary communications protocols. Users of the system may communicate with the system 100 from client computers connected to a communications network such as the Internet. Alternatively, users may communicate with the system using computers connected to a local area network on the same premises as the system 100, or even using terminals directly connected to the system 100.
- the invention also includes a computer program product stored on a computer readable medium such as a CD-ROM, a DVD or a hard drive, and including instructions capable of performing the various steps of the invention when installed and executed on a computer system.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A method and a system for providing a targeted message in conjunction with a search for information in a computer system. A first database includes searchable references to publications, said references being indexed according to a plurality of domains. A second database includes messages associated with identifiers indicating relevance to at least one of said domains. When a request to perform a search of said first database is received, said request including a designation of which one of said plurality of domains to search and one or more search criteria, a search of references indexed as belonging to said designated domain is performed in said first database, while at least one message specified as relevant to said designated domain is selected from said second database. In response to said request, a list of at least some references found as a result of said search is transmitted along with said at least one selected message.
Description
METHOD OF TARGETING MESSAGES
BACKGROUND OF THE INVENTION
1. Field of The Invention
The present invention relates to a method and a system for targeting messages. In particular, the invention describes a method and a system for providing messages to users of a vertical search system based on user input.
2. Background of The Invention
A wide variety of services are currently being provided electronically to users of the Internet. Many such services involve receiving information from a user and performing a search for a publication corresponding to the information received from the user. In this context a publication can be a particular web page, a reference to a printed publication, a description of goods for sale, a record in a database, or almost any piece of information that can be referenced and that can be identified in some meaningful manner based on the search input from the user.
Providers of search services may often want to provide a user with additional information when the results of the search are presented. If the search is for a particular type of item in an online catalog of goods, the provider may want to present information regarding similar types of goods, and if the search is for a web page describing a geographical location, the search provider may want to present an advertisement for a travel agency or an airline servicing that location.
Systems have been developed for selecting messages based on the search input information received from the user and to present the selected messages based on the theory that a user will be more likely to show an interest in a message that is in some sense related to the search performed. Such methods have in particular been used in order to combine search services with advertising.
Present systems are limited in the sense that the search input provided by the user does not provide any demographic information about the user. In order to select messages based on demographics it has been necessary to provide either a system that requires user registration, or a more or less intrusive system that collects statistics on user behavior and classifies user demographics based on an analysis of such statistics. Users are generally less likely to use such systems.
SUMMARY OF THE INVENTION
The present invention makes use of the fact that in order to perform a vertical search, users will have an interest in providing information about themselves because it helps narrow the search and filter out irrelevant information. A vertical search is a search for information that has already been classified as belonging to some domain, such as a particular business or profession.
A vertical search service may, according to the invention, provide a first database of searchable references to publications, where the references are classified as belonging to at least one of a plurality of domains. The various domains represent different vertical fields of search. Such a service may then provide a second database of messages that are specified as being relevant to at least one of the domains. When a request to perform a vertical search is received by the service it will include a specification of which domain to search. In addition to performing the search within the specified domain, a message may then be selected based on the specification of domain. This message may then be transmitted back to the user along with the results of the search.
The invention further provides for a specification of the various domains, the classification of messages, and the design of the user interface presented to users to be complementary, such that users are encouraged to provide information about themselves, their profession, hobbies, specialties etc in a manner that facilitates efficient searching as well as well targeted selection of messages.
According to an embodiment consistent with the principles of the present invention, a request to perform a search is received in response to a presentation to the user of a user interface where the user can select one of a plurality of domains, enter one or more search criteria and cause said selected domain and said search criteria to be transmitted.
The messages may include one or more media types, such as text, image, video, animation and audio. In addition to the association of the messages to one or more of the vertical domains, additional identifiers associated with the messages can be used to select not only based on domain, but also by an evaluation of correspondence between the search criteria provided by the user and the additional identifiers.
According to a further embodiment consistent with the principles of the present invention, the messages may include a component with which a user can interact when the message is displayed on a client computer. Such a component may be a hyperlink or some similar device allowing the user to interact with the message. In response to user interaction with such a component, a short message, or token, may be transmitted back to the search service, where a record of received tokens is kept.
Such a record can be used to generate statistics of how often the various messages have been displayed and caused a user to interact with them.
The present invention provides a method and a system for providing the various features and advantages of the invention, as specified in the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a system consistent with an embodiment of the invention;
FIG. 2 illustrates, in a flow chart, a crawling and indexing process;
FIG. 3 illustrates a process of entering messages in a targeted messages database;
FIG. 4 illustrates a process of performing a vertical search and retrieving search results and targeted messages; and
FIG. 5A and FIG. 5B show a tool bar used as a search form for performing a search.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
The rapid growth of the world-wide web poses unprecedented scaling challenges for general-purpose search engines. According to the present invention, a new vertical search system may be based on a combination of controlled crawling, classification and domain specific indexing. This may make it possible to selectively seek out and index pages that are relevant to a professional domain, or some other domain defined by how the pages relate to a pre-defined topic or set of topics. The topics may be specified by keywords, or alternatively by using exemplary documents. Rather than collecting and indexing all accessible web documents to be able to answer all possible ad-hoc queries, the vertical search system indexes pages that are likely to be most relevant for a particular domain and avoids irrelevant regions of the web. (It should be understood that domain here refers to the topical domain defined by a profession, a hobby, or some other particular field of knowledge or information, not to Internet domains defined by domain names.)
The seeking out and indexing of pages relevant to a specific topic may be referred to as focused indexing. Focused indexing means that only a certain subset of available web pages will be indexed based on a certain rule. The content of a page that is indexed is analyzed and categorized. If it fits in a given list of interests, then the page is stored and the links that are stored in that page may be marked as candidates for further indexing in a web-crawler-like process. As an example the rule may be that if the content of the page can be defined as "medical", including all the aspects of the medical area (doctors, patients, diseases, treatments, medications, hospitals, research, etc), the page should be included in a database where the topic is medical information.
According to certain principles of the invention, the topical domains may be categorized not only according to general topical information, but according to demographical information about the intended audience. Continuing with the medical example, categories could then be doctors, nurses, ambulance personnel, insurance agents, and lawyers. This categorization may be the top level category (e.g. the database contains information for doctors), or it may be a sub-category of the top category (e.g. within the field of medical information the user may specify a demographic sub group). This demographical categorization may serve two purposes. First, it helps target information more precisely: even if the topic is brain surgery, it can be assumed that a surgeon, a nurse, a patient and a lawyer are after different types of documents within that general topic. Second, it helps the provider of the search system attach additional information to the user based on demographical information provided by the user. This could be used for general messaging purposes, e.g. a university may want to bundle messages to their professors, staff and students when they use information retrieval systems, but they may not want to send the same information to all groups, or it could be used for advertising or other types of messaging from third party information providers (i.e. information providers that do not necessarily provide any of the information the user actually searches for).
Reference is first made to FIG. 1, which is a block diagram of a system 100 configured to operate in accordance with an embodiment of the invention.
The exemplary system 100 as illustrated in FIG. 1 includes three subsystems, a vertical database generation subsystem 110, a targeted messaging subsystem 120, and a searching subsystem 130. The system 100 may also include two databases, a vertical information database 140 and a targeted messaging database 150.
The first subsystem 110 is a vertical database generation subsystem. In addition to a search parameter interface 111, this subsystem may include a classifier 112, a crawler module 113, an indexer 114 and a ranking module 115. The various modules are able to communicate over a common system bus 105, which may extend to or be replicated in the other subsystems, as will be further described below.
It will be understood by those skilled in the art that the various modules may consist of a combination of hardware and software components, including standard computer system components such as processors, memory, input/output units etc, which for the sake of simplicity are not shown in the drawing.
The search parameter interface 111 is used during creation, maintenance and expansion of the vertical search system. Over this interface, a definition of one or more domains may be entered into the system. The domains represent the vertical domains, or topical domains, that will be available in the system 100. A list of these domains may be stored in a taxonomy table in the vertical database 140. Topics to be included in the taxonomy table may be defined by a professional community (e.g. medical) and relate to various professions or topics within this community (e.g. cardiology, radiology etc).
Further, a number of exemplary documents relevant to one or more topics may then be input into the system 100 over the interface 111. The sample documents may typically be selected by one or more persons representing the professional community. According to some aspects of the invention, topics and documents may be added in order to expand or refine the search database over time.
Finally, over the interface 111 a number of seed URLs may be input into the subsystem 110. These URLs may refer to seed pages, or sites, on the Internet or in some other repository of documents, as will be further described below.
The sample documents may, according to aspects consistent with principles of the invention, be passed to a classifier 112. The classifier may parse the sample documents and create a statistical representation of them, based e.g. on the number of times certain words occur. If the topic is cardiology, dominating words may typically be such words as heart, blood, cardiology, etc.
The process of inputting sample documents into the classifier in order to generate these statistics may be referred to as training.
Based on the various statistics generated by the classifier for the various topics in the taxonomy table, the classifier will be able to classify additional documents. If, for example, an arbitrary document retrieved from the Internet is presented to the classifier 112, the classifier may parse the document, generate statistics and compare the statistics with the statistics created for the various topics during training. A measure of the similarity may then be generated, and this may be used as an indication of the degree to which the document can be considered as relevant to the particular topic. The metrics used in this process may be referred to as topic models or category models.
The classifier may be configured to classify each document as belonging to the one category with which it is most similar, or alternatively a document may be considered as belonging to several categories. Also, documents belonging to the same category may all be considered equally relevant, or their relevance may be weighted based on the degree of similarity with the training data.
Documents may also be rejected as not being relevant to any of the topics.
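The training and classification described above can be illustrated with a minimal sketch. It assumes a plain word-count model per topic and cosine similarity as the similarity measure; the function names, the sample texts and the rejection threshold of 0.2 are all hypothetical and are not taken from the invention, which leaves the classification technique open.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lower-case the text and count how often each word occurs."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def train(samples_by_topic):
    """Build one word-count model per topic from its sample documents."""
    models = {}
    for topic, docs in samples_by_topic.items():
        model = Counter()
        for doc in docs:
            model.update(term_counts(doc))
        models[topic] = model
    return models


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, models, threshold=0.2):
    """Return (topic, score) for the most similar topic.

    The topic is None when even the best score is below the threshold,
    i.e. the document is rejected as not relevant to any topic.
    """
    counts = term_counts(text)
    scores = {topic: cosine(counts, model) for topic, model in models.items()}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]


# Hypothetical training data for two topics in a medical taxonomy.
models = train({
    "cardiology": ["The heart pumps blood. Cardiology studies the heart and blood vessels."],
    "radiology": ["Radiology uses x-ray imaging. An x-ray image shows bone."],
})
```

The score returned by `classify` could also serve as the relevance weight mentioned above. A practical system would at least remove common words before counting, or use a trained classifier such as a support vector machine.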
Various techniques for text classification are known by those with skill in the art. For an example, reference is made to "Text Categorization with Support Vector Machines: Learning with Many Relevant Features", by Thorsten Joachims, Universität Dortmund, Informatik LS8, Baroper Str. 301, 44221 Dortmund, Germany, which is hereby incorporated by reference.
Further methods of ranking documents relative to each other will be discussed below.
In order to obtain documents for inclusion in the search database 140 the subsystem may further include a crawler 113. The crawler 113 may be given the URLs of a number of seed sites, or documents, as input. The seed sites may again be selected by one or more persons representing the professional community as representative quality documents. However, the documents may also be selected based on their assumed quality as starting points for the crawling process. This assumption will not only be based on the quality of the content of each document itself, but also on how it references other documents, e.g. by way of hyperlinks, and the location and assumed quality of the referenced documents.
The crawler 113 may parse the seed documents until it finds references to other documents. These referenced documents may then be retrieved and parsed in a similar manner for additional references to new documents. This process may be repeated, in principle indefinitely, and the number of collected documents will grow. A practical implementation of the crawler 113 may include the creation and maintenance of a crawler table where all URLs are stored. All documents referenced in the crawler table may then be revisited by the crawler 113 (i.e. retrieved again) at regular intervals. In this manner the crawler table is permanently updated and the indexed content, described further below, is refreshed.
The collected documents may be forwarded to the classifier 112, as described above, and the classifier 112 may determine whether any given document is sufficiently relevant to be included in the database 140.
As a matter of design choice, the crawler 113 may operate independently of the classifier 112. Alternatively, the crawler 113 may be configured to not follow links out of documents that are determined to be irrelevant by the classifier 112, not to follow links out of irrelevant documents that were linked to by irrelevant documents, or some similar rule. Such a rule may be imposed in order to avoid crawling irrelevant areas of the network.
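A minimal sketch of a crawler that applies the first of these rules is given below. It assumes that fetching and relevance testing are supplied as functions, and it uses a small in-memory dictionary as a stand-in for the network; `focused_crawl`, `TOY_WEB` and the keyword-based relevance test are hypothetical names chosen for illustration only.

```python
from collections import deque


def focused_crawl(seeds, fetch, is_relevant, max_pages=100):
    """Breadth-first crawl that does not follow links out of irrelevant documents.

    fetch(url) returns (text, links) or None; is_relevant(text) returns a bool.
    Returns the crawler table (all URLs seen) and the list of relevant URLs.
    """
    crawler_table = set(seeds)
    queue = deque(seeds)
    relevant = []
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        page = fetch(url)
        fetched += 1
        if page is None:
            continue
        text, links = page
        if not is_relevant(text):
            continue  # rule: links out of an irrelevant document are not followed
        relevant.append(url)
        for link in links:
            if link not in crawler_table:
                crawler_table.add(link)
                queue.append(link)
    return crawler_table, relevant


# Hypothetical stand-in for the web: url -> (text, outgoing links).
TOY_WEB = {
    "a": ("heart surgery", ["b", "c"]),
    "b": ("football news", ["d"]),
    "c": ("heart valve", ["a"]),
    "d": ("heart rate", []),
}

table, relevant = focused_crawl(["a"], TOY_WEB.get, lambda text: "heart" in text)
```

In this toy run page "d" is never discovered, because the only link to it sits on the irrelevant page "b". This shows the trade-off of such a rule: less crawling of irrelevant areas, at the cost of possibly missing relevant documents that are only reachable through irrelevant ones.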
In order to further process a document that has been classified as relevant to one or more topics, the subsystem 110 may include an indexer 114. The indexer creates an index of all retrieved documents in order to facilitate searching.
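One common way to build such an index is an inverted index. The sketch below assumes that index keys combine the topical domain with a term, so that a lookup is confined to one vertical domain; the structure and the names `build_index` and `lookup` are assumptions made for illustration, not a description of the indexer 114.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map (domain, term) to the set of ids of documents containing that term."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        terms = set(re.findall(r"[a-z0-9]+", doc["text"].lower()))
        for domain in doc["domains"]:
            for term in terms:
                index[(domain, term)].add(doc_id)
    return index


def lookup(index, domain, terms):
    """Ids of documents in the given domain that contain every term."""
    postings = [index.get((domain, term.lower()), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Hypothetical classified documents.
DOCUMENTS = {
    1: {"text": "Coated stent outcomes", "domains": {"doctors"}},
    2: {"text": "Stent aftercare", "domains": {"nurses", "doctors"}},
    3: {"text": "Billing codes", "domains": {"insurance agents"}},
}

index = build_index(DOCUMENTS)
```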
The documents that are classified as relevant may also be subjected to a ranking algorithm in a ranking module 115. As already mentioned, ranking may be based on the degree of relevance found by the classifier. Other ranking algorithms may be used instead of or in addition to the relevance measure, including algorithms based on link analysis, search term frequency etc.
The vertical database generation subsystem 110 may be connected to the actual database 140 over a communications link 160. This communications link may also connect to the other subsystems as further described below. The communications link may be part of a local area or wide area network, or it may be part of or an extension of the system bus 105.
The various tables and results produced by the subsystem 110 may be stored in the database 140.
The targeted messaging subsystem 120 may include a messaging input interface 121 and a messaging exposure statistics module 122. These modules may be interconnected by a system bus 105.
The messaging input interface 121 may be an interface for receiving the messages that are to be exposed to users of the search system 100 in accordance with the invention. The interface may be a user interface such as a web page interface accessible over the Internet, or some other communications interface over which messages can be received in order to be stored in a database of targeted messages 150. Each received message may, in addition to the message itself, include conditions for its display, including an identification of a demographic group, or user group, for which it is intended.
A message exposure statistics module 122 may register each exposure of the various messages. According to one embodiment where the targeted messages may be associated with several alternative exposure criteria, the statistics module 122 may also generate statistics regarding which of the various conditions have triggered exposure.
The resulting generated statistics may according to some embodiments of the invention be accessible over the same interface 121 used for uploading messages to the system 100.
A third subsystem may be the searching subsystem 130. This subsystem is accessible by users of the system 100 in order for such users to input search requests and receive search results and targeted messages. The search subsystem 130 may include a searching user interface 131, which may be based on a web server, and a search engine 132.
The search interface will be described in further detail below. Over the search interface 131, the search engine 132 receives search requests that may include search terms, a demographic identification of the user performing the search, and additional categorical identifiers used to narrow the search. Based on this input the search engine will perform a search in the vertical database 140 based on the index stored there, and generate a response based on the relative ranking of the documents identified as hits, i.e. documents that fulfill the search criteria. In addition the search engine may use the demographic identification, and optionally also additional categorical identifiers or search terms, to select one or more messages from the targeted messaging database 150. The selected messages will then be presented to the user over the user interface 131 together with the search results.
It will be understood by those with skill in the art that the various subsystems may be tightly integrated into one, or distributed over several systems, according to design preferences. Similarly, the two databases may reside in the same database system or be distributed over two or more database systems.
Reference is now made to FIG. 2, which shows a flow chart of a crawling and indexing process consistent with principles of the invention. It will be seen that according to the example illustrated in FIG. 2 the process may include two main branches, one where a classifier is trained to classify documents, and one where the web, or a subset of the web, is crawled and documents are retrieved.
Following the start of the process in a startup step 200, the system is provided with a taxonomy table in a step 201. As described above, the taxonomy table may be provided over the search parameter interface 111 as one or more topics to be stored in a taxonomy table. The classifier 112, using the taxonomy table as a starting point, is then trained on a set of exemplary documents in a step 202.
Based on the results of the training, the relevant topic models or category models are created 203. The classifier is now trained to classify documents as belonging to one or more categories. According to some embodiments the degree to which a document belongs to a category may be weighted. As an example, a document may be relatively more relevant to doctors than to nurses.
In a step 204 crawler seeds may be provided to the web crawler 113. The web crawler 113, starting with a number of seed documents, crawls the web or some defined subset of the web based on the seeds in a next step 205. Web crawling is well known in the art and can be summarized as a process where all documents specified as seed documents are retrieved and parsed. If any hyperlink referencing another document is found as a result of the parsing, this identified document is retrieved and parsed in the same manner.
All documents that are retrieved during crawling, and that qualify according to some quality criteria, are passed 206 to the classifier module 112 for classification. It will be realized by those with skill in the art that the step 205 of crawling the web may continue while documents that have already been retrieved are further processed.
The classifier 112 will use the topic model from step 203 in order to classify 207 the retrieved documents. Documents that do not fit within the topic model may be filtered out. The classified documents may then be indexed 208 by the indexer 114 and ranked 209 by the ranking module 115 according to some of the methods known in the art, and then stored in a searchable database 140 in a step 210. This brings the process of retrieving documents to an end 211.
It will be realized that while FIG. 2 illustrates a simplified process with a well defined beginning and end, a real life system would run the various parts of the process in parallel, and certain steps would be constantly repeated in order to add new documents to the database, refine the classification, improve ranking etc.
Reference is now made to FIG. 3, which illustrates a process of entering messages in the targeted messages database 150. The process starts in a first step 300. In a next step 301 the targeted messaging subsystem 120 receives a message over the messaging input interface 121. The received message may include one or more tags or pieces of meta information or some other form of attached information indicating which vertical domain or domains the message is relevant to.
In a next step 302, the received message is stored in the targeted messaging database 150. The entry may be indexed or otherwise identifiable based on the attached domain information in a manner that makes it possible to select messages based on the vertical domain or domains (topics) to which they are relevant. Additional criteria for selection may also be included, such as a designated priority.
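A minimal sketch of such a store is shown below, assuming an in-memory index from domain to message identifiers and an integer priority where a higher value means higher priority. The class and field names are hypothetical; a real system would keep this data in the targeted messaging database 150.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    message_id: str
    body: str
    domains: frozenset  # vertical domains the message is relevant to
    priority: int = 0   # assumed convention: higher value, higher priority


class MessageStore:
    """In-memory stand-in for the targeted messaging database."""

    def __init__(self):
        self._by_id = {}
        self._by_domain = defaultdict(set)

    def add(self, message):
        """Store the message and index it under each of its domains."""
        self._by_id[message.message_id] = message
        for domain in message.domains:
            self._by_domain[domain].add(message.message_id)

    def for_domain(self, domain):
        """Messages relevant to the domain, highest priority first."""
        ids = self._by_domain.get(domain, set())
        return sorted(
            (self._by_id[i] for i in ids),
            key=lambda m: (-m.priority, m.message_id),
        )


store = MessageStore()
store.add(Message("m1", "Cardiology conference", frozenset({"doctors"}), priority=1))
store.add(Message("m2", "First aid course", frozenset({"doctors", "nurses"}), priority=5))
```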
In addition to establishing a record or entry in the database 150, tracking of exposure may be established in a statistics module 122 in the messaging subsystem 120. For some embodiments of the invention it may not be necessary to establish such tracking, but it may be advantageous if the messaging system is an advertising system where advertisers pay according to exposure, or if the above mentioned priority assigned to a message is based on previous exposure.
The registration process may also involve presenting a confirmation of the successful registration of the message in the database 150. Otherwise the process may end in a final step 304.
Reference is now made to FIG. 4, which illustrates a process of performing a vertical search and retrieving search results and targeted messages. The process starts in a first step 400.
In a next step 401 the search subsystem 130 receives a request for a search entry form over the search interface 131. The search entry form may be transmitted 402 as an HTML document, and will be further described below. A user of the vertical search system 100 may fill in his or her search request by entering search terms in the search form, and in addition, indicate the vertical domain the user wants to search, as well as other categorizing information that serves to narrow the search.
Alternatively the search form may be implemented as a toolbar installed on a user computer as a plug-in to a web browser, as a widget application, or even as a separate client application. According to such an embodiment, the request for the search form 401 will be a request to download the application or plug-in, and the transmittal will be the delivered download from a server.
In a step 403 the search request is received by the searching interface 131 and passed to the search engine 132. The search engine may then perform two searches, one in each database 140, 150. These searches may be performed sequentially or in parallel. In FIG. 4 they are illustrated as being performed in parallel, but this should be understood only as an illustrative example.
One search 404 is in the vertical database 140 for information requested by the user. The other search 405 is in the targeted messaging database 150. The search in the vertical database 140 will use the definition of a topic or vertical domain in the search request to select which vertical domain to search. Additional information may also be present in order to limit the scope of the search. Search terms included in the search request, such as words, phrases or regular expressions, may be used to search for relevant information within the selected domain. The hits may then be returned based on how they were ranked by the ranking system 115. Additional ranking may be performed based on search terms or other forms of text analysis.
The search in the targeted messaging database 150 may primarily be based on the selected vertical domain. Whether the domain defines a type of user, a particular topic or field of knowledge, or a combination of such categories, the messages in the database 150 may have been designated as relevant to only one or some of these categories. The search 405 in the targeted messaging database 150 returns messages that are designated in a way that corresponds with the search performed in the vertical database 140. As an example, if the search in the vertical database 140 is performed in a domain relevant to doctors, the search in the messaging database 150 will return messages designated as relevant to doctors. One or more of the returned messages may then be selected sequentially, randomly, based on previous exposure as recorded by the exposure statistics module 122, based on assigned priority, based on keywords included in the search request, or some combination of these. Any combination of some or all of these alternatives, and even additional alternatives, may be implemented in a system operating in accordance with the principles of the present invention.
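The two searches can be sketched as follows. The example assumes simple in-memory lists in place of the databases 140 and 150, substring matching of search terms, and one of the selection policies named above: highest priority first, with ties broken in favor of the message that has been exposed least. All names and data are hypothetical.

```python
def search_references(references, domain, terms):
    """References in the domain whose text contains every term, best rank first."""
    wanted = [term.lower() for term in terms]
    hits = [
        ref for ref in references
        if domain in ref["domains"]
        and all(term in ref["text"].lower() for term in wanted)
    ]
    return sorted(hits, key=lambda ref: ref["rank"], reverse=True)


def select_message(messages, domain, exposures):
    """Highest-priority message for the domain; ties go to the least exposed one."""
    candidates = [m for m in messages if domain in m["domains"]]
    if not candidates:
        return None
    return min(candidates, key=lambda m: (-m["priority"], exposures.get(m["id"], 0)))


def handle_request(request, references, messages, exposures):
    """Run both searches on the designated domain and combine the results."""
    hits = search_references(references, request["domain"], request["terms"])
    message = select_message(messages, request["domain"], exposures)
    return {
        "results": [ref["url"] for ref in hits],
        "message": message["body"] if message else None,
    }


REFERENCES = [
    {"url": "https://example.org/stents", "text": "Coated stent outcomes",
     "domains": {"doctors"}, "rank": 0.9},
    {"url": "https://example.org/care", "text": "Stent aftercare for nursing staff",
     "domains": {"nurses"}, "rank": 0.8},
    {"url": "https://example.org/review", "text": "Review of stent designs",
     "domains": {"doctors"}, "rank": 0.5},
]
MESSAGES = [
    {"id": "m1", "body": "Cardiology conference", "domains": {"doctors"}, "priority": 1},
    {"id": "m2", "body": "Nursing course", "domains": {"nurses"}, "priority": 1},
    {"id": "m3", "body": "New stent catalogue", "domains": {"doctors"}, "priority": 1},
]

page = handle_request(
    {"domain": "doctors", "terms": ["stent"]},
    REFERENCES, MESSAGES, exposures={"m1": 4, "m3": 1},
)
```

Note that the message is chosen from the designated domain alone, so a message can be selected even when the search itself returns no hits.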
In a following step 406 the results may be combined into a result page. The result page may include some or all of the information, or references to information (e.g. URLs) retrieved from the vertical database 140 and one or more messages retrieved from the targeted messaging database 150. The result page may then be transmitted to the user in a step 407. Following the transmittal of the result page, the exposure statistics module 122 may update the exposure statistics for the message or messages transmitted as part of the result page. Alternatively, the exposure statistics may be updated prior to or simultaneously with the transmittal of the result page.
According to some embodiments consistent with the invention, the update of the exposure statistics may be dependent on further action by the user. As an example, the exposure statistics may only be updated after the user actually clicks on a message hyperlink in order to request additional information or otherwise indicate that the information has been received and read. Alternatively, the exposure statistics may register initial transmission as well as later interaction by the user.
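The last alternative, where both transmission and later interaction are registered, can be sketched with two counters per message. The class and method names below are hypothetical and only illustrate the bookkeeping; they are not a description of the exposure statistics module 122.

```python
from collections import Counter


class ExposureStats:
    """Counts how often each message was transmitted and interacted with."""

    def __init__(self):
        self.transmissions = Counter()
        self.interactions = Counter()

    def record_transmission(self, message_id):
        """Call when a message is sent as part of a result page."""
        self.transmissions[message_id] += 1

    def record_interaction(self, message_id):
        """Call when a user interacts with the message, e.g. clicks its hyperlink."""
        self.interactions[message_id] += 1

    def interaction_rate(self, message_id):
        """Interactions per transmission, or 0.0 if never transmitted."""
        sent = self.transmissions[message_id]
        return self.interactions[message_id] / sent if sent else 0.0


stats = ExposureStats()
for _ in range(4):
    stats.record_transmission("m3")
stats.record_interaction("m3")
```

An embodiment that only counts exposure after a click would simply use the interaction counter and ignore the transmission counter.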
Reference is now made to FIG. 5A and FIG. 5B, which show an example of how a toolbar representing a search form may look to a user wishing to utilize a vertical search and messaging system according to the present invention. The toolbar may be implemented as a plug-in to a web browser and displayed as part of the web browser user window on the display of a user computer.
FIG. 5A shows the toolbar 500A before the user has entered any search terms. The toolbar 500A includes a number of fields for entering or selecting search terms and vertical domain. It will be understood that the number and nature of the fields illustrated in the drawing are exemplary only, and that other alternatives are possible within the scope of the invention.
In a first field 501 a user may enter one or more keywords, search terms or a regular expression of such terms, dependent on the capabilities of the particular embodiment of the search engine 132 and the vertical database 140 the user is interacting with. In the illustrated example the user has entered "Coated Stent" in the keyword field 501. In the next field 502, according to this example, the user can select community. In FIG. 5B the chosen community is "Health & Medical". Additional fields include "profession" 503, "category" 504 and "sub-category" 505. In FIG. 5B these have been chosen as "MD", "Medication" and "Cardiology", respectively. According to the example illustrated in FIG. 5A two additional fields include "filter category" 506 and "filter sub-category" 507, which may serve to exclude hits, again dependent on the functionality implemented in the system 100, particularly the search engine 132. In the example illustrated in FIG. 5B the filter category and filter sub-category are chosen as "Treatment" and "Infarction", respectively.
When the user is ready to perform the search, he or she may click on a search button 510. The plug-in may then be configured to transmit a search request to the system 100. The search request may typically include the various values or words entered by the user, as well as necessary information such as the address of the client computer used by the user, in accordance with the communications protocol used between the client computer and the searching interface 131 of the system 100.
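If the request is sent over HTTP, the toolbar values could for instance be encoded as query parameters. The sketch below uses Python's standard `urllib.parse` module; the endpoint `https://search.example/search` and the parameter names are hypothetical, since the request format is not specified here.

```python
from urllib.parse import urlencode


def build_search_url(base_url, fields):
    """Encode the non-empty toolbar fields as an HTTP query string."""
    params = {name: value for name, value in fields.items() if value}
    return f"{base_url}?{urlencode(params)}"


# Values from the example of FIG. 5B; parameter names are assumed.
url = build_search_url("https://search.example/search", {
    "keywords": "Coated Stent",
    "community": "Health & Medical",
    "profession": "MD",
    "category": "Medication",
    "sub_category": "Cardiology",
    "filter_category": "Treatment",
    "filter_sub_category": "Infarction",
    "unused_field": "",
})
```

Fields the user left empty are omitted, so the server can tell which values were actually selected.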
When the search request is received by the searching interface it is forwarded to the search engine 132. The search engine will then search the database based on the vertical domain defined by the selections made by the user, including one or more of community 502, profession 503, category 504 and sub-category 505. It should be noted that the vertical domain may be defined by only one, or by several of the values selected by the user. Values selected by the user, but not used to define the vertical domain, may be used to specify the search in other ways, or they may only be used when selecting a message from the targeted messaging database 150.
The filter values 506, 507 may be used to remove hits that otherwise would be returned by the search.
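The interplay between the vertical domain, the keywords and the filter values can be sketched as below. This is a hypothetical Python illustration, not the search engine 132 itself: the `Reference` type, the plain substring keyword match, and the assumption that a filter excludes references tagged with the filtered value are all choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    title: str
    # Hypothetical index tags, e.g. {"profession": "MD", "category": "Medication"}.
    tags: dict = field(default_factory=dict)


def search(references, keywords, domain, filters=None):
    """Return references inside the vertical domain that match every keyword
    and are not excluded by any filter value."""
    filters = filters or {}
    words = [word.lower() for word in keywords.split()]
    hits = []
    for ref in references:
        in_domain = all(ref.tags.get(key) == value for key, value in domain.items())
        matches = all(word in ref.title.lower() for word in words)
        excluded = any(ref.tags.get(key) == value for key, value in filters.items())
        if in_domain and matches and not excluded:
            hits.append(ref)
    return hits


references = [
    Reference("Coated stent outcomes in cardiology",
              {"profession": "MD", "category": "Medication"}),
    Reference("Coated stent treatment after infarction",
              {"profession": "MD", "category": "Treatment"}),
    Reference("Coated stent care guide",
              {"profession": "Nurse", "category": "Medication"}),
]

# The domain is defined by profession only; the filter removes "Treatment" hits.
hits = search(references, "Coated Stent",
              domain={"profession": "MD"},
              filters={"category": "Treatment"})
```

Here the second reference is removed by the filter and the third falls outside the designated domain, so only the first is returned.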
The various values, words or terms included in the search may also be used when the search of the targeted messaging database 150 is performed. According to one embodiment only one value is used. This value may, for example, be the profession 503 in the example illustrated in FIG. 5 A and FIG. 5 B. According to alternative embodiments, some or all of the additional categories, the filter categories and even the keywords or search terms may be used to perform the search and select messages from the targeted messaging database 150.
According to one embodiment consistent with principles of the invention, categories, sub-categories and/or filters can be used not only to make the search results in the vertical database more relevant to the user, but also to improve the choice of the targeted message(s). In this case the messages may be tagged, or categorized, according to one or more vertical domains, to sub-categories of these domains, and possibly also to filter categories or refinement categories. Messages may then be selected only when as many as possible of the tags in the search request correspond with those of the message.
As an example, a number of different messages can all be categorized as relevant to a given profession, such as medical doctor (MD). If the user only specifies his profession 503, any of these messages may be selected. Similarly, if only the category 504 is selected, for instance medication, any message relevant to medication may be selected. However, if both profession 503 and category 504 are specified as MD and medication, respectively, it will be possible to select messages where both of these are part of the described targeting of the message.
When the user narrows his search by selecting MD (profession 503) and Medication (category 504) and Cardiology (sub-category 505) and Treatment (filter category 506) and Cholesterol (filter sub-category 507), a message, or a link to a message, that fulfills all these criteria may be selected, such as: "Nicotinic Acid. A new drug for treating high cholesterol." Such a message can be expected to be more relevant to the user, and the frequency of interaction (clicking) with it correspondingly higher.
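The selection rule described in the preceding paragraphs, where the preferred messages are those whose tags correspond with as many tags of the search request as possible, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the dictionary layout of a message, the function names `match_score` and `select_messages`, and the simple count-based score are hypothetical and are not prescribed by the description.

```python
def match_score(message_tags, request_tags):
    """Count how many tags in the request correspond with the message's tags."""
    return sum(
        1
        for key, value in request_tags.items()
        if value in message_tags.get(key, set())
    )


def select_messages(messages, request_tags):
    """Return the messages with the highest number of matching tags,
    or an empty list when no message matches any tag."""
    scored = [(match_score(m["tags"], request_tags), m) for m in messages]
    best = max((score for score, _ in scored), default=0)
    if best == 0:
        return []
    return [m for score, m in scored if score == best]


# Hypothetical messages; each tag key maps to a set of accepted alternatives.
messages = [
    {"text": "General notice for physicians",
     "tags": {"profession": {"MD"}}},
    {"text": "New medication catalogue",
     "tags": {"category": {"Medication"}}},
    {"text": "Cardiology drug update",
     "tags": {"profession": {"MD"},
              "category": {"Medication"},
              "sub_category": {"Cardiology"}}},
]

broad = select_messages(messages, {"profession": "MD"})
narrow = select_messages(
    messages,
    {"profession": "MD", "category": "Medication", "sub_category": "Cardiology"},
)
```

With only the profession specified, any message tagged for MD qualifies; once category and sub-category are added, only the message matching all three tags remains.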
When the message provider enters the targeted message over the message input interface 121, he may be asked to enter as much information as possible regarding categories. According to some embodiments of the invention, multiple alternatives may be selected for vertical domain (e.g. several professions), category, sub-category etc. When this is the case, the exposure statistics module 122 may be configured to register which of the different alternatives triggered each exposure (e.g. if a message was displayed to an MD or a nurse).
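One way the exposure statistics module 122 might register which alternative triggered an exposure is sketched below in Python. The class name `ExposureStatistics`, its methods and the in-memory counter are assumptions made for illustration; the description does not specify how the statistics are stored.

```python
from collections import Counter


class ExposureStatistics:
    """Counts exposures per message and per triggering tag value."""

    def __init__(self):
        self._exposures = Counter()

    def register(self, message_id, message_tags, request_tags):
        # Record every request tag that matched one of the message's alternatives.
        for key, value in request_tags.items():
            if value in message_tags.get(key, set()):
                self._exposures[(message_id, key, value)] += 1

    def count(self, message_id, key, value):
        return self._exposures[(message_id, key, value)]


# Hypothetical message targeted at two professions.
tags = {"profession": {"MD", "Nurse"}}

stats = ExposureStatistics()
stats.register("m1", tags, {"profession": "MD"})
stats.register("m1", tags, {"profession": "MD"})
stats.register("m1", tags, {"profession": "Nurse"})
```

After these three exposures, the module can report that the message was displayed twice to an MD and once to a nurse.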
It will be realized by those with skill in the art that various types of computer systems, communications infrastructures etc. can be used when implementing the invention. The system 100 may be installed on one or more server computers using one of a number of well known operating systems such as one of the many variants of Unix (UNIX is a registered trademark of The Open Group), Linux (Linux is a registered trademark of Linus Torvalds) or Windows (Windows is a registered trademark of Microsoft Corporation). The databases may be based on a database management system (DBMS) that is able to interface to a search engine 132 and a web server.
The various interfaces 111, 121, 131 may be implemented in a number of ways. According to some embodiments at least some of the interfaces are implemented using a scripting or other programming language (such as PHP, Perl, CGI, Java, JavaScript, ASP) running in conjunction with a web server. Communication may then be based on protocols such as TCP/IP and HTTP, but those with skill in the art will realize that alternatives may be used without departing from the scope and spirit of the invention. In particular, communication between the various subsystems of the system 100 may be implemented using other open or proprietary communications protocols.
Users of the system may communicate with the system 100 from client computers connected to a communications network such as the Internet. Alternatively, users may communicate with the system using computers connected to a local area network on the same premises as the system 100, or even using terminals directly connected to the system 100.
The invention also includes a computer program product stored on a computer readable medium such as a CD-ROM, a DVD or a hard drive, and including instructions capable of performing the various steps of the invention when installed and executed on a computer system.
Claims
1. A method for providing a targeted message in conjunction with a search for information in a computer system, comprising: providing a first database of searchable references to publications, said references being indexed according to a plurality of domains; providing a second database of messages, said messages being associated with identifiers indicating relevance to at least one of said domains; receiving a request to perform a search of said first database, said request including a designation of which one of said plurality of domains to search and one or more search criteria; performing a search of references indexed as belonging to said designated domain; selecting at least one message specified as relevant to said designated domain from said second database; and transmitting, in response to said request, a list of one or more references found as a result of said search along with said at least one selected message.
2. The method according to claim 1, wherein each of said domains represents a profession.
3. The method according to claim 1, wherein said request to perform a search is received in response to the presentation of a user interface where a user can select one of a plurality of domains, enter one or more search criteria and cause said selected domain and said search criteria to be transmitted.
4. The method according to claim 1, wherein said messages in said second database are messages including one or more of the following media types: text, image, video, animation, audio.
5. The method according to claim 1, further comprising: associating said messages in said second database with additional identifiers; and selecting from among messages specified as relevant to said designated domain based on an evaluation of correspondence between said one or more search criteria and said additional identifiers.
6. The method according to claim 5, wherein at least one of said one or more additional identifiers indicates a sub-category of one of said plurality of domains, and at least one of said search criteria is a designation of one or more such subcategories.
7. The method according to claim 1, further comprising: including, in one or more of said messages, a component with which a user can interact if the message is displayed on a user computer; receiving a token which is caused to be transmitted from a user computer when a user chooses to interact with a component of a message; and creating a record of received tokens.
8. A system for providing a targeted message in conjunction with a search for information in a computer system, comprising: a first database of searchable references to publications, said references being indexed according to a plurality of domains; a second database of messages, said messages being associated with identifiers indicating relevance to at least one of said domains; a search engine configured to
- receive a request to perform a search of said first database, said request including a designation of which one of said plurality of domains to search and one or more search criteria;
- perform a search of references indexed as belonging to said designated domain;
- select at least one message specified as relevant to said designated domain from said second database; and
- transmit, in response to said request, a list of one or more references found as a result of said search along with said at least one selected message.
9. The system according to claim 8, wherein each of said domains represents a profession.
10. The system according to claim 8, further comprising: a searching interface module configured to generate a presentation of a user interface where a user can select at least one of a plurality of domains, enter one or more search criteria and cause said selected domain and said search criteria to be transmitted to the system.
11. The system according to claim 8, wherein said messages in said second database are messages including one or more of the following media types: text, image, video, animation, audio.
12. The system according to claim 8, wherein said messages in said second database are associated with one or more additional identifiers; and selecting from among messages specified as relevant to said designated domain based on an evaluation of correspondence between said one or more search criteria and said one or more additional identifiers.
13. The system according to claim 12, wherein at least one of said one or more additional identifiers indicates a sub-category of one of said plurality of domains, and at least one of said search criteria is a designation of one or more such subcategories.
14. The system according to claim 8, further comprising: a search parameter interface module configured to receive
- a plurality of domain designations;
- exemplary documents each designated as being relevant to at least one of said plurality of domains;
- one or more references to external documents containing references to additional external documents;
a classifier module configured to parse exemplary documents, create a model for determining relevance based on said exemplary documents, and determine relevance of additional documents based on the model created from the exemplary documents;
a crawler module configured to crawl a web of referenced documents based on said references to external documents and retrieve documents identified during the crawling process;
an indexer module configured to create an index based on documents retrieved during said crawling process; and
a ranking system configured to rank documents based on a ranking algorithm.
15. The system according to claim 8, further comprising: a messaging input interface configured to receive messages associated with identifiers indicating relevance to at least one of said domains; and an exposure statistics module configured to generate statistics regarding the selection and transmission of said messages.
16. The system according to claim 15, wherein one or more of said messages in said second database includes a component with which a user can interact if the message is displayed on a user computer; and said exposure statistics module is configured to receive a token which is caused to be transmitted from a user computer when a user chooses to interact with a component of a message and create a record of received tokens.
17. A computer program product stored on a computer readable medium and including instructions capable of performing the method of one of claims 1 to 7 when installed and executed on a computer system.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NO20063813 | 2006-08-25 | ||
NO20063813 | 2006-08-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2008023994A1 (en) | 2008-02-28 |
Family
ID=38617440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/NO2007/000301 WO2008023994A1 (en) | 2006-08-25 | 2007-08-27 | Method of targeting messages |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2008023994A1 (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006012120A2 (en) * | 2004-06-24 | 2006-02-02 | Google Inc. | Results based personalization of advertisements in a search engine |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | NENP | Non-entry into the national phase | Ref country code: RU |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 07808615; Country of ref document: EP; Kind code of ref document: A1 |
| | 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC, EPO FORM 1205A DATED 22.04.2010 |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 07808615; Country of ref document: EP; Kind code of ref document: A1 |