WO2014059851A1 - Search server and search method - Google Patents
- Publication number
- WO2014059851A1 (application PCT/CN2013/083925)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- website
- information
- search
- trusted
- web page
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Definitions
- the present invention relates to network search, and more particularly to a search server and a corresponding search method for searching, based on keywords, for trusted websites such as official websites over a network.
Background
- search websites use search engines to extract information from various websites (mainly based on web pages) from the Internet and establish a database.
- the search engine can retrieve records that match the user's query criteria.
- each matching record is given a ranking score according to how well it matches the query, and the records are sorted by score and returned to the user.
- information provided by some highly reliable and well-known websites will be given a higher ranking.
- the higher ranked information is obtained first, which makes it possible for the user to obtain more reliable information.
- search engine optimization (SEO)
- some search engines collect statistics on users' search behavior and build a dedicated result display for frequently searched terms, such as Baidu's Fengchao system, to provide users with more accurate and credible information.
- such statistics work reasonably well for websites that attract wide public attention; for websites that attract little attention, or that are relatively specialized with small audiences, it is difficult to find the corresponding official website.
- search engines don't build a special display for most search terms.
- the search engine does not necessarily prioritize official website information when constructing a dedicated display mode. For some official websites that have just been created or have few visits, the search engine has not been specially optimized, and it is difficult for users to obtain information from the official website from the search results.
- the current search engine does not fully consider the importance of the reliability of the official website information to the user. Therefore, when the search result is presented to the user, the information from the official website and other information are not distinguished.
- the present invention has been made in order to provide a search server and a corresponding search method that overcome the above problems or at least partially solve the above problems.
- a search server comprising: an information store, a trusted website store, and a search engine.
- the information storage is adapted to store web page information collected from websites accessing the Internet, the web page information including at least the content of the web page and its URL.
- the trusted website storage is adapted to store website information of a plurality of trusted websites, the website information including at least the website name and the website address.
- the search engine is adapted to receive a search keyword submitted from the user terminal, retrieve from the information storage the web page information whose content includes the search keyword, retrieve from the trusted website storage the website information whose website name corresponds to the search keyword, and combine the retrieved web page information and website information to obtain search results.
- a search method operates in a search server including an information storage and a trusted website storage, wherein the information storage is adapted to store web page information collected from websites connected to the Internet, the web page information including at least the content of the web page and its URL; the trusted website storage is adapted to store website information of a plurality of trusted websites, the website information including at least the website name and the website address.
- the search method includes the steps of: receiving a search keyword submitted from a user terminal; retrieving from the information storage the web page information whose content includes the search keyword; retrieving from the trusted website storage the website information whose website name corresponds to the search keyword; and combining the retrieved web page information and website information to obtain search results.
- trusted website information such as official website information can be retrieved more accurately, and when the search results are presented, information from trusted websites can be distinguished from other information, which makes it easy for the user to accurately obtain the reliable information needed.
- FIG. 1 is a schematic structural diagram of a search server according to an embodiment of the present invention
- FIG. 2 is a flowchart of a search method according to an embodiment of the present invention
- FIG. 3 is a block diagram of a client or server executing a method in accordance with the present invention, in accordance with one embodiment of the present invention
- FIG. 4 is a memory unit for holding or carrying program code implementing a method in accordance with the present invention, in accordance with one embodiment of the present invention.
Detailed description
- the present invention provides a search server and a search method for searching for a trusted website such as an official website, which will be described in detail below with reference to the accompanying drawings.
- Figure 1 illustrates a search server in accordance with one embodiment of the present invention, including the information collecting/processor 100, the information storage 101, the trusted website storage 110, the trusted website information processor 111, and the search engine 120.
- the user inputs a search keyword through the user terminal 140, searches for a search result marked with a trusted website such as an official website via the search server of the present invention, and presents the search result to the user through the user terminal 140.
- the user terminal may be a computer terminal, or may be a mobile phone, various electronic devices capable of accessing the Internet, or the like.
- the information collecting/processor 100 collects web page information from web servers 1, 2, ... N accessing the Internet, and stores the web page information in the information memory 101.
- the collected web page information includes at least the content of the web page and its URL, and may of course include other content as needed, such as the type of the web page, whether the web page is embedded with a Trojan or the like.
- the information collecting/processor 100 can collect web page information from various web servers by means of traditional Internet information gathering tools, such as "spiders" and "crawlers", and process the obtained web pages, for example by extracting their keywords, URLs, IP addresses, and the like; the processed web pages are stored in the information storage 101.
- the trusted website storage 110 stores website information for a plurality of trusted websites.
- the website information includes at least the website name and the website URL. Of course, it can also include other contents as needed, such as a brief introduction of the website, and some basic information about the organization.
- a trusted website, as referred to in the present invention, is a website whose information the public can trust.
- a trusted website may specifically include an official website established by an organization on the network (referred to as the official website), and may of course include other officially recognized websites (such as websites established by some organizations for a specific project).
- for convenience of explanation, the terms "official website" and "trusted website" are used interchangeably in some places, but in each case they refer to the trusted website.
- the website name mentioned in the present invention refers to the name that best identifies the website or the organization providing it, so the website name can be not only the website title but also the name by which the organization is referred to in the website content, the organization's common names, and so on; a website can have multiple names.
- the trusted website information processor 111 collects official website information from various reliable data sources and stores it in the trusted website storage 110.
- data sources for the trusted website storage include manual input by administrators, import from trusted sites (for example, the website of the Ministry of Industry and Information Technology), monitoring of user search behavior (for example, a website that users frequently click in search results can be identified as a possible official website and then reviewed for confirmation), manual input by users after registration, and so on.
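The click-monitoring data source could be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the log format (pairs of search keyword and clicked host), the function name `candidate_official_sites`, and the `min_clicks` threshold are all assumptions introduced here. The output is only a list of candidates that would still go through the review mentioned above.

```python
from collections import Counter

def candidate_official_sites(click_log, min_clicks=3):
    """Return (keyword, host) pairs clicked at least min_clicks times.

    click_log is a hypothetical list of (search keyword, clicked host) pairs.
    The result is a list of candidates for manual review, not confirmed
    official websites.
    """
    counts = Counter(click_log)
    return sorted(pair for pair, n in counts.items() if n >= min_clicks)
```

In this sketch a frequently clicked host becomes a candidate only; nothing is written to the trusted website storage until a reviewer confirms it.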
- the official website information in the trusted website storage 110 can be stored as key-value pairs, wherein the key is the official website name and the value is the URL corresponding to that name.
- the trusted website information processor 111 determines whether the keyword corresponds to a key stored in the trusted website storage; if so, it returns the official website information, that is, the key-value pair whose key is the keyword and whose value is the URL; otherwise it returns a message indicating that there is no official website.
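The key-value lookup described above can be sketched as follows. The in-memory dictionary, the sample entries, and the function name `lookup_trusted_site` are hypothetical, and "corresponds" is simplified here to an exact match after trimming and lower-casing, which the text does not specify.

```python
from typing import Optional

# Hypothetical stand-in for the trusted website storage 110:
# key = official website name, value = website URL.
TRUSTED_SITES: dict[str, str] = {
    "example hospital": "www.example-hospital.example",
    "example university": "www.example-university.example",
}

def lookup_trusted_site(keyword: str) -> Optional[tuple[str, str]]:
    """Return the (name, URL) pair whose key matches the keyword, else None."""
    key = keyword.strip().lower()  # assumed normalisation, not from the source
    url = TRUSTED_SITES.get(key)
    return (key, url) if url is not None else None
```

Returning `None` plays the role of the "no official website" message in this sketch.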
- the search engine 120 includes a search processor 121.
- the search processor 121 can directly receive the search keyword submitted by the user terminal, retrieve from the information storage 101 the web page information containing the search keyword, retrieve from the trusted website storage 110 the website information corresponding to the search keyword, and combine the retrieved web page information and website information to obtain search results.
- the search processor 121 retrieves the information memory 101 in a conventional search manner based on search keywords input by the user from the terminal to obtain a search result list from the information memory 101.
- the search result list includes one or more search result items, each of which is the web page information of a retrieved web page containing the search keyword; the web page information may be a key-value pair, wherein the key is the URL of the corresponding web page and the value is the ranking score of the page (used for ranking the search results).
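As a small illustration of this data shape, a search result list held as URL-to-score pairs can be ordered by ranking score as below. The function name `rank_results` and the use of floating-point scores are assumptions; how the scores are computed is outside this sketch.

```python
def rank_results(results: dict[str, float]) -> list[tuple[str, float]]:
    """Sort URL -> ranking score pairs from highest to lowest score."""
    return sorted(results.items(), key=lambda item: item[1], reverse=True)
```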
- the search processor 121 transmits the search keyword input by the user from the terminal to the trusted website information processor 111. If the trusted website information processor 111 fails to find a corresponding keyword in the trusted website storage 110, the search result is the web page information retrieved from the information storage 101. If the trusted website information processor returns trusted website information, the trusted website information is merged with the web page information retrieved from the information storage 101.
- during merging, if the web page URL in the web page information of a search result item corresponds to the website URL in the trusted website information returned from the trusted website information processor 111, that search result item is deleted from the search result list, and the new search result list after deletion (i.e., the new web page information) together with the trusted website information forms the final search result.
- the correspondence between the web page URL in the web page information and the website URL in the trusted website information mentioned above does not mean that the two are identical.
- the website URL of an official website usually includes only the host name, not the path and file name after the host name.
- web page URLs usually include the host name, path, and file name.
- the web page URL corresponds to the website URL when the host name parts of the two URLs are the same, or when the root domains of the host names are the same.
- for example, if the website URL of the official website is www.aaa.com and the web page URL is www.aaa.com/b/c.html, the two can be considered as corresponding.
- alternatively, if the official website URL is aaa.com and the web page URL is www.aaa.com/b/c.html, the two can also be considered as corresponding.
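The correspondence rule (same host name, or same root domain) can be sketched with Python's standard `urllib.parse`. The helper names are hypothetical, and taking the last two labels of the host as the root domain is a naive assumption that would mis-handle suffixes such as `co.uk`.

```python
from urllib.parse import urlsplit

def host_of(url: str) -> str:
    """Extract the lower-cased host name from a URL with or without a scheme."""
    if "//" not in url:
        url = "//" + url  # urlsplit only recognises the host after "//"
    return (urlsplit(url).hostname or "").lower()

def root_of(host: str) -> str:
    """Naive root domain: the last two labels of the host name (assumption)."""
    return ".".join(host.split(".")[-2:])

def urls_correspond(page_url: str, site_url: str) -> bool:
    """True if the hosts are equal or share the same (naive) root domain."""
    page_host, site_host = host_of(page_url), host_of(site_url)
    if not page_host or not site_host:
        return False
    return page_host == site_host or root_of(page_host) == root_of(site_host)
```

With these helpers, `www.aaa.com/b/c.html` corresponds to both `www.aaa.com` and `aaa.com`, while a page on a different domain does not.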
- a search preprocessor 122 can also be provided in the search engine 120.
- the preprocessor 122 is a conventional search engine component adapted to preprocess the search terms entered by the user, to eliminate common words, and to adjust some of the search terms to generate search terms that the search engine considers appropriate.
- the preprocessor 122 modifies the search keywords to words that are consistent with the trusted website name stored in the trusted website storage 110. For example, when the search keyword input by the user is "Tuen Mun Hospital", the preprocessor 122 automatically adjusts the keyword to a more accurate "Qianmenli Hospital" when preprocessing the keyword.
- the user-submitted search keywords are pre-processed by the pre-processor 122 to provide more efficient keywords to the search processor 121 for more efficient retrieval in the information store 101 and the trusted website store 110.
- the search server also includes a result processor 130.
- the result processor 130 processes the search results from the search processor 121 and presents them to the user terminal 140; when the search results consist of trusted website information and new web page information, processing includes presenting the website information prominently.
- the website information may be presented prominently on the user terminal 140 by adding a V or other trusted mark to the title of a trusted website such as an official website; by dividing the page with a dividing line, showing the trusted website information above the line and the other search results below it; or by highlighting the trusted website information.
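Two of the presentation options above (a trusted mark on the title and a dividing line) can be sketched as plain-text rendering. The function name `render_results`, the `[V]` marker, and the 40-character divider are illustrative assumptions; an actual user terminal would render markup rather than text.

```python
def render_results(trusted, pages):
    """Render trusted website info above a dividing line, other results below.

    trusted is a (name, URL) pair or None; pages is a list of (URL, score)
    pairs already sorted by ranking score.
    """
    lines = []
    if trusted is not None:
        name, url = trusted
        lines.append(f"[V] {name} - {url}")  # trusted mark on the title
        lines.append("-" * 40)               # dividing line
    for url, _score in pages:
        lines.append(url)
    return "\n".join(lines)
```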
- the search processor in accordance with the present invention provides users with a reliable search for trusted website information, based on the trusted website storage and the search engine. Moreover, using the result processor, the trusted website information found in the trusted website storage is displayed differently from the web page information found in the information storage, giving the user prominent and reliable trusted website information.
- the search method is adapted to operate in the search server shown in FIG. 1.
- the method starts in step S210, in which the search keyword submitted from the user terminal is received.
- the method may further pre-process the search keywords to generate more accurate keywords for the search processor; the pre-processing includes removing common words, such as the function word "of", from the search keyword, and/or correcting obvious errors in the keyword.
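A minimal sketch of this pre-processing, assuming English keywords split on whitespace. The word list `COMMON_WORDS` and the correction table `CORRECTIONS` are hypothetical placeholders; the text does not say how common words or obvious errors are identified.

```python
# Hypothetical word list and correction table, for illustration only.
COMMON_WORDS = {"of", "the", "a", "an"}
CORRECTIONS = {"hosptial": "hospital"}

def preprocess(keyword: str) -> str:
    """Correct known misspellings, then drop common words."""
    words = keyword.lower().split()
    words = [CORRECTIONS.get(w, w) for w in words]
    kept = [w for w in words if w not in COMMON_WORDS]
    return " ".join(kept)
```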
- in step S220, the web page information whose content includes the search keyword is retrieved from the information storage 101, and the website information whose website name corresponds to the search keyword is retrieved from the trusted website storage 110.
- the information memory 101 is retrieved in a conventional search manner based on search keywords input by the user from the terminal to obtain a search result list from the information memory 101.
- the search result list includes one or more search result items, each of which is the web page information of a retrieved web page containing the search keyword; the web page information may be a key-value pair, wherein the key is the URL of the corresponding web page and the value is the ranking score of the page (used for ranking the search results).
- the official website information in the trusted website storage 110 can be stored as key-value pairs, wherein the key is the official website name and the value is the URL corresponding to that name.
- the trusted website information processor 111 checks in the trusted website storage 110 whether a key of the official website information corresponds to the search keyword. Optionally, this step is performed by the search processor 121 of the search engine directly, or by the search processor 121 via the trusted website information processor 111.
- in the next step S230, it is determined whether to return website information; specifically, if a key of the official website information in the trusted website storage 110 corresponds to the search keyword, the official website information is returned, that is, the key-value pair whose key is the keyword and whose value is the URL; otherwise a message indicating that there is no official website is returned.
- this step is performed by the trusted website information processor 111.
- in step S240, the content corresponding to the website information is deleted from the web page information to obtain new web page information, and the website information and the new web page information are combined to generate the final search result, which is provided to the result processor 130; for example, if, in the search result list retrieved from the information storage 101, the web page URL in the web page information of a search result item corresponds to the website URL in the trusted website information returned from the trusted website information processor 111, that search result item is deleted from the search result list, and the new search result list after deletion (i.e., the new web page information) together with the trusted website information forms the final search result.
- if no website information is returned, the web page information obtained in step S220 is directly used as the final search result.
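Steps S230 and S240 can be sketched together as below. The names `host_of` and `merge_results` are hypothetical, and correspondence is simplified here to host-name equality only; the root-domain variant described earlier is omitted to keep the sketch short.

```python
from typing import Optional
from urllib.parse import urlsplit

def host_of(url: str) -> str:
    """Extract the lower-cased host name from a URL with or without a scheme."""
    if "//" not in url:
        url = "//" + url  # urlsplit only recognises the host after "//"
    return (urlsplit(url).hostname or "").lower()

def merge_results(pages: list[tuple[str, float]],
                  trusted: Optional[tuple[str, str]]):
    """Drop page results on the trusted site's host and pair the rest with it.

    pages holds (URL, ranking score) items; trusted is a (name, URL) pair or
    None. When trusted is None the page list is returned unchanged.
    """
    if trusted is None:
        return None, list(pages)
    site_host = host_of(trusted[1])
    remaining = [(url, score) for url, score in pages
                 if host_of(url) != site_host]
    return trusted, remaining
```

Removing the overlapping items keeps the official website from appearing twice, once as the trusted entry and once as an ordinary result.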
- in step S250, the search result is processed and returned to the user terminal; when the search result consists of the website information and the new web page information, the website information is presented prominently, preferably by adding a trusted mark to the website name in the website information, or by placing the website information before the new web page information and distinguishing the two by highlighting or a dividing line.
- this step is performed by result processor 130.
- the search method according to the present invention further comprises using a trusted website information processor to process website information obtained in a trusted manner and store it in the trusted website storage, wherein the website information obtained in a trusted manner includes one or more of: website information imported from a trusted web site, website information entered manually, and website information obtained by monitoring users' search behavior.
- the trusted website storage and trusted website information processor are closely integrated with an existing search engine, so that information about trusted websites such as official websites is searched accurately and presented to the user prominently, enabling users to obtain reliable search results more dependably.
- the invention can also be implemented as a device or device program (e.g., a computer program and a computer program product) for performing some or all of the methods described herein.
- a program implementing the invention may be stored on a computer readable medium or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
- Figure 3 illustrates a computing device in which the method of the present invention may be implemented, where the computing device may be a client or server capable of implementing the methods of the present invention.
- the client or server conventionally includes a processor 310 and a computer program product or computer readable medium in the form of a memory 320.
- the memory 320 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read Only Memory), an EPROM, a hard disk, or a ROM.
- Memory 320 has a memory space 330 for program code 331 for performing any of the method steps described above.
- storage space 330 for program code may include various program code 331 for implementing various steps in the above methods, respectively.
- the program code can be read from or written to one or more computer program products.
- These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
- Such a computer program product is typically a portable or fixed storage unit as described with reference to FIG.
- the storage unit may have a storage segment, a storage space, and the like arranged similarly to the storage 320 in the client or server of FIG.
- the program code can be compressed, for example, in an appropriate form.
- the storage unit includes computer readable code 33, i.e., code that can be read by a processor, such as 310, which, when executed by a server, causes the server to perform various steps in the methods described above.
- "an embodiment" or "one or more embodiments" as used herein means that the particular features, structures, or characteristics described in connection with the embodiments are included in at least one embodiment of the invention.
- the phrase “in one embodiment” herein does not necessarily refer to the same embodiment.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
Abstract
A search server is disclosed. Said server is provided with a trusted-website memory that stores website information of trusted websites, said information comprising at least website names and website addresses. When a user uses said network search server to conduct a website information search, the server can provide to the user trusted search results. Moreover, a search results processor is used to display on the user terminal the trusted-website information found in the trusted-website memory in such a manner as to distinguish said information from other search results. A corresponding search method is also disclosed.
Description
Search server and search method
Technical field
The present invention relates to network search, and more particularly to a search server and a corresponding search method for searching, based on keywords, for trusted websites such as official websites over a network.
Background
With the rapid development of the Internet, enterprises, organizations, and individuals have come to recognize the importance of providing information services online and have set up their own websites to publish information. As the number of websites providing network information services keeps growing, it is difficult for Internet users to remember the specific addresses of all these websites, or even of the websites they want to visit. At the same time, the amount of information on the Internet is growing explosively and is by now vast. Under these circumstances, letting Internet users reach the website they want to visit, or find the information they want, in the shortest possible time has become a pressing need. As a result, alongside the early websites that published various kinds of information, websites and servers dedicated to search emerged. Internet search websites and the various search methods derived from them have in turn greatly promoted the development of the Internet. Today, Internet users rely heavily on search websites to locate websites and obtain the information they need.
In general, a search website uses a search engine to extract information (mainly web page text) from websites on the Internet and build a database. When a user submits a query on the search website, the search engine retrieves the records that match the user's query criteria. Each record in the search results is given a ranking score according to how well it matches the query, and the records are sorted by score and returned to the user. Normally, information provided by highly reliable, well-known websites is given a higher ranking. When the user searches, the higher-ranked information is obtained first, which makes it possible for the user to obtain more reliable information.
However, for commercial reasons such as advertising and click-through rate, some service providers perform search engine optimization (SEO) against the ranking algorithms of existing search engines so that their own information is ranked higher; this information may be incorrect or even malicious. Users cannot intuitively judge whether the information returned by their query is trustworthy.
In addition, although search engines collect the official websites of some companies, users do not know, when search results are presented, whether a result comes from an official website. When the user's search term differs slightly from a company's name, and a service provider has optimized for that search term, the provider may even be ranked ahead of the official website.
Some search engines collect statistics on users' search behavior and build a dedicated result display for frequently searched terms, such as Baidu's Fengchao system, to provide users with more accurate and credible information. However, such statistics work reasonably well only for websites that attract wide public attention; for websites that attract little attention, or that are relatively specialized with small audiences, it is difficult to find the corresponding official website, because search engines do not build a dedicated display for most search terms. Moreover, when a search engine builds a dedicated display, it does not necessarily give priority to official website information. For official websites that have just been created or receive few visits, the search engine performs no special optimization, and it is difficult for users to obtain information from the official website in the search results.
It follows that current search engines do not fully consider how important the reliability of official website information is to users, and therefore do not distinguish information from official websites from other information when presenting search results.
A new way of ranking and presenting search results is therefore needed, one that displays information from trusted websites such as official websites with a higher ranking and in a prominent manner.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a search server and a corresponding search method that overcome, or at least partially solve, the above problems.
According to one aspect of the present invention, a search server is provided, comprising an information storage, a trusted website storage, and a search engine. The information storage is adapted to store web page information collected from websites connected to the Internet, the web page information including at least the content of the web page and its URL. The trusted website storage is adapted to store website information of a plurality of trusted websites, the website information including at least the website name and the website address. The search engine is adapted to receive a search keyword submitted from a user terminal, retrieve from the information storage the web page information whose content includes the search keyword, retrieve from the trusted website storage the website information whose website name corresponds to the search keyword, and combine the retrieved web page information and website information to obtain search results.
According to another aspect of the present invention, a search method is provided that runs in a search server comprising an information storage and a trusted website storage. The information storage is adapted to store web page information collected from websites connected to the Internet, the web page information including at least the content of each web page and its URL. The trusted website storage is adapted to store website information of a plurality of trusted websites, the website information including at least the website name and the website address.
The search method comprises the following steps: receiving a search keyword submitted from a user terminal; retrieving from the information storage the web page information whose content includes the search keyword; retrieving from the trusted website storage the website information whose website name corresponds to the search keyword; and combining the retrieved web page information and website information to obtain the search results.
With the search server and search method provided by the present invention, trusted website information such as official-website information can be retrieved more accurately, and when the search results are presented, information from trusted websites can be distinguished from other information, making it easier for users to find the reliable information they need.
The above description is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the contents of the specification, and so that the above and other objects, features, and advantages of the present invention become more apparent, specific embodiments of the invention are set forth below.
Brief Description of the Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating specific embodiments and are not to be construed as limiting the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a schematic structural diagram of a search server according to an embodiment of the present invention;
Fig. 2 is a flowchart of a search method according to an embodiment of the present invention;
Fig. 3 is a block diagram of a client or server for executing the method according to the present invention, according to an embodiment of the present invention; and
Fig. 4 shows a storage unit for holding or carrying program code that implements the method according to the present invention, according to an embodiment of the present invention.
Detailed Description
The present invention provides a search server and a search method for searching trusted websites such as official websites, which are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a search server according to an embodiment of the present invention, comprising an information collector/processor 100, an information storage 101, a trusted website storage 110, a trusted website information processor 111, and a search engine 120. A user enters a search keyword through a user terminal 140; the search server of the present invention returns search results in which trusted websites, such as official websites, are marked, and these results are presented to the user through the user terminal 140. In the present invention, the user terminal may be a computer terminal, a mobile phone, or any other electronic device capable of accessing the Internet.
The information collector/processor 100 collects web page information from the website servers 1, 2, ..., N connected to the Internet and stores it in the information storage 101. The collected web page information includes at least the content of each web page and its URL, and may include other items as needed, such as the type of the web page or whether a Trojan has been embedded in it. The information collector/processor 100 may obtain web pages from the website servers by conventional Internet information gathering methods, such as "spiders" or "crawlers", and then process the obtained pages, for example by extracting subject terms, keywords, URLs, IP addresses, and so on. The processed web pages are stored in the information storage 101.
The trusted website storage 110 stores website information of a plurality of trusted websites. The website information includes at least the website name and the website address, and may include other items as needed, such as a brief introduction to the website or basic information about the organization the website belongs to.
It should be noted that a trusted website, as referred to in the present invention, is a website whose information the public can trust. Such trusted websites include, specifically, the official websites that organizations set up on the network, and may also include other officially recognized websites (for example, a website that an organization sets up separately for a particular project). For convenience of description, this application uses "official site", "official website", and "trusted website" interchangeably in some places; in every case the term means a trusted website.
It should also be noted that the website name, as referred to in the present invention, is the name that best reflects the website or the organization providing it. The website name is therefore not limited to the website title; it may also be the organization name mentioned in the website content, the name by which the organization is commonly known, and so on, and a website may have more than one website name.
The trusted website information processor 111 gathers official-website information from various reliable data sources and stores it in the trusted website storage 110. The data sources for the trusted website storage include manual entry by administrators, imports from trusted sites (for example, the website of the Ministry of Industry and Information Technology), monitoring of user search behavior (for example, a website that users frequently click in search results may be flagged as a possible official website and then confirmed by review), manual entry by registered and authenticated users, and so on.
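The click-monitoring source mentioned above could be sketched as follows. This is only an illustration under stated assumptions: the log format, the function name `candidate_official_sites`, and both thresholds are hypothetical, and the output is a list of candidates for manual review, not confirmed official websites.

```python
from collections import Counter, defaultdict

def candidate_official_sites(click_log, min_clicks=3, min_share=0.6):
    """Flag, per query, the host that dominates user clicks.

    click_log: iterable of (query, clicked_host) pairs (assumed format).
    Returns {query: host}; every entry still needs manual review.
    """
    per_query = defaultdict(Counter)
    for query, host in click_log:
        per_query[query][host] += 1

    candidates = {}
    for query, counts in per_query.items():
        host, clicks = counts.most_common(1)[0]
        share = clicks / sum(counts.values())
        # Both thresholds are illustrative, not values from the source.
        if clicks >= min_clicks and share >= min_share:
            candidates[query] = host
    return candidates
```

A query with too few clicks, or with clicks spread evenly over many hosts, yields no candidate.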
The official-website information stored in the trusted website storage 110 may be stored as key-value pairs, where the key is the official website name and the value is the URL corresponding to that name. When the trusted website information processor 111 receives a search keyword from the search processor 121, it determines whether the keyword corresponds to a key stored in the trusted website storage. If so, it returns the official-website information, that is, the key-value pair whose key is the keyword and whose value is the URL; otherwise it returns a message indicating that no official website exists.
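The key-value lookup described above might look like the following minimal sketch. The dictionary contents, the function name `lookup_trusted_site`, and the use of exact string matching for "corresponds" are assumptions made for illustration; the source does not specify the matching rule or the storage technology.

```python
from typing import Optional

# Hypothetical in-memory stand-in for the trusted website storage 110.
# Keys are website names (one site may have several); values are URLs.
TRUSTED_SITES: dict[str, str] = {
    "Example Hospital": "www.example-hospital.test",
    "Example Hosp.": "www.example-hospital.test",
}

def lookup_trusted_site(keyword: str) -> Optional[tuple[str, str]]:
    """Return the (name, URL) pair for the keyword, or None if no official site exists."""
    name = keyword.strip()
    url = TRUSTED_SITES.get(name)  # exact match is an assumption
    if url is None:
        return None                # plays the role of the "no official website" message
    return name, url
```

Storing several names for the same URL is one simple way to support the multiple website names the description allows.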
The search engine 120 includes a search processor 121. The search processor 121 can directly receive the search keyword submitted by the user terminal, retrieve from the information storage 101 the web page information containing the search keyword, retrieve from the trusted website storage 110 the website information whose website name corresponds to the search keyword, and combine the retrieved web page information and website information to obtain the search results.
On the one hand, the search processor 121 searches the information storage 101 in the conventional manner, based on the search keyword entered by the user at the terminal, to obtain a search result list from the information storage 101. The search result list contains one or more search result items, each of which is the web page information of one retrieved page that contains the search keyword. The web page information may be a key-value pair, where the key is the URL of the web page and the value is the page's ranking score (used to rank the search results).
On the other hand, the search processor 121 sends the search keyword entered by the user at the terminal to the trusted website information processor 111. If the trusted website information processor 111 fails to find trusted website information for the keyword in the trusted website storage 110, the search results are simply the web page information retrieved from the information storage 101. If the trusted website information processor returns trusted website information, that information is merged with the web page information retrieved from the information storage 101. For example, if the web page URL of some search result item in the list retrieved from the information storage 101 corresponds to the website URL in the trusted website information returned by the trusted website information processor 111, that item is deleted from the search result list, and the resulting new search result list (that is, the new web page information) together with the trusted website information forms the final search results.
It should be noted that the correspondence mentioned above between the web page URL in the web page information and the website URL in the trusted website information does not mean that the two are identical. In general, the website URL of an official website contains only the host name, without the path and file name that follow it, whereas a web page URL usually contains the host name, path, file name, and so on. In the present invention, a web page URL corresponds to a website URL when the host name parts of the two URLs are the same, or when the root site names within the host names are the same. For example, if the website URL of the official website is www.aaa.com and the web page URL is www.aaa.com/b/c.html, the two can be regarded as corresponding. Likewise, if the website URL of the official website is aaa.com and the web page URL is www.aaa.com/b/c.html, the two can also be regarded as corresponding.
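The correspondence rule above can be approximated with the standard library, as in the sketch below. The function names `host_of` and `corresponds` are hypothetical, and the "same root site name" condition is simplified to "one host is a subdomain of the other"; sibling subdomains such as news.aaa.com and www.aaa.com would need a public-suffix lookup, which is outside this sketch.

```python
from urllib.parse import urlsplit

def host_of(url: str) -> str:
    """Extract the lowercase host name, tolerating URLs written without a scheme."""
    if not url.startswith(("http://", "https://", "//")):
        url = "//" + url  # makes urlsplit treat the leading part as the host
    return (urlsplit(url).hostname or "").lower()

def corresponds(page_url: str, site_url: str) -> bool:
    """True if the hosts are identical or one is a subdomain of the other."""
    page_host, site_host = host_of(page_url), host_of(site_url)
    if not page_host or not site_host:
        return False
    return (page_host == site_host
            or page_host.endswith("." + site_host)
            or site_host.endswith("." + page_host))
```

Both examples from the description, www.aaa.com against www.aaa.com/b/c.html and aaa.com against www.aaa.com/b/c.html, are treated as corresponding under this rule.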
Optionally, a search preprocessor 122 may also be provided in the search engine 120. The preprocessor 122 is a conventional search engine component adapted to preprocess the search terms entered by the user: it removes common words and adjusts some of the terms to produce search terms that the search engine considers appropriate. In particular, when a keyword entered by the user is close to a trusted website name, the preprocessor 122 changes it to the matching trusted website name stored in the trusted website storage 110. For example, when the user enters the search keyword "蓟门医院" ("Jimen Hospital"), the preprocessor 122 automatically adjusts it to the more accurate "蓟门里医院" ("Jimenli Hospital"). After preprocessing, the user's search keyword becomes a more effective keyword, which is passed to the search processor 121 so that retrieval from the information storage 101 and the trusted website storage 110 is more effective.
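One way to sketch this preprocessing is with Python's `difflib`, as below. The stop-word list, the similarity cutoff, and the function name `preprocess` are assumptions; the source does not say how closeness to a trusted website name is measured, so the fuzzy matching here is a stand-in technique.

```python
import difflib

STOP_WORDS = {"the", "of", "a"}   # illustrative only
TRUSTED_NAMES = ["蓟门里医院"]     # names assumed to be held in the trusted website storage

def preprocess(query: str, cutoff: float = 0.8) -> str:
    """Drop common words, then snap the query to a close trusted website name if one exists."""
    words = [w for w in query.split() if w.lower() not in STOP_WORDS]
    cleaned = " ".join(words)
    # get_close_matches returns names whose similarity ratio is at least `cutoff`.
    close = difflib.get_close_matches(cleaned, TRUSTED_NAMES, n=1, cutoff=cutoff)
    return close[0] if close else cleaned
```

With these settings, the query "蓟门医院" is adjusted to "蓟门里医院", while a query unrelated to any trusted name is returned with only its common words removed.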
As shown in Fig. 1, the search server further includes a result processor 130. The result processor 130 processes the search results from the search processor 121 and presents them to the user terminal 140. Processing the search results includes, when the search results consist of trusted website information and new web page information, handling the website information in a prominent manner. When the results are displayed on the user terminal 140, this may mean adding a "V" or another trust mark to the title of the trusted website, such as an official website; or splitting the page, with the trusted website information shown above the dividing line and the other search results shown below it; or highlighting the trusted website information.
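A plain-text sketch of the presentation options above is given here, combining a trust mark with a dividing line. The `[V]` prefix, the divider string, and the function name `render_results` are illustrative choices, not part of the source; a real result processor would emit markup for the user terminal.

```python
from typing import Optional

DIVIDER = "-" * 20  # stand-in for the dividing line on the page

def render_results(trusted: Optional[tuple[str, str]],
                   pages: list[tuple[str, float]]) -> str:
    """Show the trusted site first with a mark, then the remaining pages by score."""
    lines = []
    if trusted is not None:
        name, url = trusted
        lines.append(f"[V] {name} <{url}>")  # trust mark on the site title
        lines.append(DIVIDER)
    # Pages are (URL, ranking score) pairs; higher scores are listed first.
    for url, _score in sorted(pages, key=lambda p: p[1], reverse=True):
        lines.append(url)
    return "\n".join(lines)
```

When no trusted website information is returned, the output is just the ranked page list with no mark or divider.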
In summary, the search server according to the present invention, built on the trusted website storage and the search engine, provides users with a dependable search for trusted website information. Moreover, the result processor displays the trusted website information found in the trusted website storage differently from the web page information found in the information storage, giving users trusted website information that is both prominent and reliable.
Fig. 2 shows a flowchart of a search method according to an embodiment of the present invention. The search method is adapted to run in the search server shown in Fig. 1. The method begins at step S210, in which a search keyword submitted from a user terminal is received. Preferably, after the search keyword is received in step S210, it may also be preprocessed to produce a keyword that is more accurate for the search processor. The preprocessing includes removing common words from the search keyword, such as the function word "的", and/or correcting obvious errors in the keyword.
Then, in step S220, the web page information whose content includes the search keyword is retrieved from the information storage 101, and the website information whose website name corresponds to the search keyword is retrieved from the trusted website storage 110. On the one hand, the information storage 101 is searched in the conventional manner, based on the search keyword entered by the user at the terminal, to obtain a search result list. The search result list contains one or more search result items, each of which is the web page information of one retrieved page that contains the search keyword. The web page information may be a key-value pair, where the key is the URL of the web page and the value is the page's ranking score (used to rank the search results). On the other hand, the official-website information stored in the trusted website storage 110 may be stored as key-value pairs, where the key is the official website name and the value is the URL corresponding to that name. On receiving the search keyword, the trusted website information processor 111 checks whether the trusted website storage 110 holds official-website information whose key corresponds to the search keyword. Optionally, this step is performed by the search processor 121 of the search engine, or by the search processor 121 via the trusted website information processor 111.
In the next step, S230, it is determined whether website information is returned. Specifically, if official-website information whose key corresponds to the search keyword is found in the trusted website storage 110, the official-website information is returned, that is, the key-value pair whose key is the keyword and whose value is the URL; otherwise a message indicating that no official website exists is returned. Optionally, this step is performed by the trusted website information processor 111.
If website information is returned, then in step S240 the content corresponding to the website information is deleted from the web page information to obtain new web page information, and the website information and the new web page information are combined into the final search results, which are provided to the result processor 130. For example, if the web page URL of some search result item in the list retrieved from the information storage 101 corresponds to the website URL in the trusted website information returned by the trusted website information processor 111, that item is deleted from the search result list, and the resulting new search result list (that is, the new web page information) together with the trusted website information forms the final search results. It should be noted that the correspondence mentioned above between the web page URL and the website URL does not mean that the two are identical; it means that the host name parts of the two URLs are the same, or that the root site names within the host names are the same. Optionally, this step is performed by the search processor 121.
If no website information is returned, the web page information obtained in step S220 is used directly as the final search results.
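Steps S230 and S240, together with the fallback when no website information is returned, can be sketched as a single merge function. The type aliases, the function names, and the placeholder predicate `naive_corresponds` (exact host equality) are assumptions for illustration; a fuller correspondence rule would also accept matching root site names.

```python
from typing import Callable, Optional

Page = tuple[str, float]   # (page URL, ranking score)
Site = tuple[str, str]     # (website name, website URL)

def _host(url: str) -> str:
    """Host part of a URL with any scheme removed (simplified)."""
    return url.split("://", 1)[-1].split("/", 1)[0].lower()

def naive_corresponds(page_url: str, site_url: str) -> bool:
    """Placeholder rule: the two host names are identical."""
    return _host(page_url) == _host(site_url)

def merge(pages: list[Page], site: Optional[Site],
          corresponds: Callable[[str, str], bool] = naive_corresponds
          ) -> tuple[Optional[Site], list[Page]]:
    """Return (trusted site or None, page list with the site's own pages removed)."""
    if site is None:
        # No official website found: the page list is the final result.
        return None, list(pages)
    remaining = [p for p in pages if not corresponds(p[0], site[1])]
    return site, remaining
```

Passing the predicate in as a parameter keeps the deletion step independent of how URL correspondence is decided.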
Then, in step S250, the search results are processed and returned to the user terminal. When the search results consist of the website information and the new web page information, the website information is handled in a prominent manner, preferably by adding a trust mark to the website name in the website information, or by placing the website information before the new web page information and separating the two by highlighting or a dividing line. Optionally, this step is performed by the result processor 130.
Optionally, the search method according to the present invention further includes using the trusted website information processor to process website information obtained in a trusted manner and store it in the trusted website storage, where the website information obtained in a trusted manner includes at least one or more of: website information imported from trusted network sites, manually entered website information, and website information obtained by monitoring users' search behavior.
In summary, the search server and search method of the present invention integrate the trusted website storage and trusted website information processor with an existing search engine, accurately retrieve information about trusted websites such as official websites, and present it to the user in a prominent manner, so that users can obtain reliable search results more accurately.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art will appreciate that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of a device according to an embodiment of the present invention. The invention may also be implemented as a device or apparatus program (for example, a computer program or a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
For example, Fig. 3 shows a computing device that can implement the method of the present invention, where the computing device may be a client or a server. The client or server conventionally includes a processor 310 and a computer program product or computer-readable medium in the form of a memory 320. The memory 320 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. The memory 320 has a storage space 330 for program code 331 for performing any of the method steps described above. For example, the storage space 330 for program code may include individual program codes 331, each implementing one of the steps of the above method. The program code can be read from, or written to, one or more computer program products. These computer program products include program code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks. Such a computer program product is typically a portable or fixed storage unit as described with reference to Fig. 4. The storage unit may have storage segments, storage space, and so on, arranged similarly to the memory 320 in the client or server of Fig. 3. The program code may, for example, be compressed in an appropriate form. Typically, the storage unit includes computer-readable code 331', that is, code that can be read by a processor such as the processor 310 and that, when run by a server, causes the server to perform the steps of the method described above.
References herein to "one embodiment", "an embodiment", or "one or more embodiments" mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. In addition, note that instances of the phrase "in one embodiment" herein do not necessarily all refer to the same embodiment.
Numerous specific details are set forth in the description provided herein. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.
In addition, it should be noted that the language used in this specification has been selected principally for readability and instructional purposes, and not to delineate or circumscribe the subject matter of the invention. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. With respect to the scope of the invention, the disclosure made herein is illustrative and not restrictive, and the scope of the invention is defined by the appended claims.
Claims
1. A search server, comprising:
an information storage, adapted to store web page information collected from websites connected to the Internet, the web page information including at least the content of each web page and its URL;
a trusted website storage, adapted to store website information of a plurality of trusted websites, the website information including at least the website name and the website address; and
a search engine, adapted to receive a search keyword submitted from a user terminal, retrieve from the information storage the web page information whose content includes the search keyword, retrieve from the trusted website storage the website information whose website name corresponds to the search keyword, and combine the retrieved web page information and website information to obtain search results.
2. The search server according to claim 1, further comprising a trusted website information processor, adapted to process website information obtained in a trusted manner and store it in the trusted website storage.
3. The search server according to claim 2, wherein the website information obtained in a trusted manner includes at least one or more of: website information imported from trusted network sites, manually entered website information, and website information obtained by monitoring users' search behavior.
4. The search server according to claim 3, wherein the website information obtained by monitoring users' search behavior includes information on websites that users frequently click in their searches.
5. The search server according to any one of claims 1 to 4, wherein the search engine comprises a search preprocessor and a search processor;
所述搜索预处理器适于对所述搜索关键词进行预处理,所述预处 理包括剔除掉所述搜索关键词中的常用词, 和 /或修改所述关键词中的 明显错误, 以生成有效关键词; The search preprocessor is adapted to preprocess the search keywords, and the preprocessing includes eliminating common words in the search keywords, and/or modifying obvious errors in the keywords to generate Valid keywords;
所述搜索处理器适于从所述信息存储器中检索内容包括有效关键 词的网页信息, 经由所述可信网站信息处理器从所述可信网站存储器 检索网站名称与有效关键词相对应的网站信息, 并组合所检索到的网 页信息和网站信息以获得搜索结果。 The search processor is adapted to retrieve web page information whose content includes valid keywords from the information storage, and retrieve websites whose website names correspond to the valid keywords from the trusted website storage via the trusted website information processor. information, and combines the retrieved web page information and website information to obtain search results.
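The preprocessing step (dropping common words and correcting obvious errors) might look like the sketch below. The word list `COMMON_WORDS` and the correction table `CORRECTIONS` are hypothetical placeholders; the claim does not say how either is built.

```python
# Hypothetical data: the claim does not define these tables.
COMMON_WORDS = {"the", "a", "of", "website"}
CORRECTIONS = {"gogle": "google"}


def preprocess(query):
    """Turn a raw query into a list of valid keywords.

    Steps: lowercase and split, fix known misspellings, then drop
    common words. If every word is a common word, the corrected words
    are returned unchanged so the query is never emptied.
    """
    words = query.casefold().split()
    words = [CORRECTIONS.get(w, w) for w in words]
    kept = [w for w in words if w not in COMMON_WORDS]
    return kept or words
```

For example, `preprocess("the Gogle website")` yields `["google"]` under these placeholder tables.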
6. The search server according to claim 5, wherein the search processor combining the retrieved web page information and website information includes:
when the trusted website information processor does not return the website information, the search result is the web page information retrieved from the information storage;
when the trusted website information processor returns the website information, content corresponding to the website information is deleted from the web page information to obtain new web page information, and the search result is the website information and the new web page information.
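The two combination cases can be illustrated with a small function. It assumes that "content corresponding to the website information" means web pages hosted on the same host as a returned trusted site; that interpretation, the dictionary layout, and the names `host` and `combine` are assumptions for this sketch.

```python
from urllib.parse import urlsplit


def host(url):
    """Lowercased host name of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def combine(site_hits, page_hits):
    """Combine trusted site hits with ordinary web page hits.

    Case 1: no trusted site was returned, so the result is the web
    page hits as retrieved.
    Case 2: trusted sites were returned, so pages on the same hosts
    are removed and the trusted sites are placed first.
    """
    if not site_hits:
        return list(page_hits)
    trusted_hosts = {host(s["url"]) for s in site_hits}
    new_pages = [p for p in page_hits if host(p["url"]) not in trusted_hosts]
    return list(site_hits) + new_pages
```

Removing the overlapping pages keeps a trusted site from appearing twice, once as a trusted entry and once as an ordinary result.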
7. The search server according to any one of claims 1-6, further comprising a result processor adapted to process the search result from the search processor and return it to the user terminal, wherein processing the search result includes: when the search result is the website information and the new web page information, processing the website information in a prominent manner.
8. The search server according to claim 7, wherein processing the website information in a prominent manner consists of adding a trusted mark to the website name of the website information, or placing the website information before the new web page information and distinguishing the two by highlighting or a dividing line.
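As a plain-text illustration of prominent presentation, the sketch below applies both options from the claim at once: it prefixes each trusted site with a mark and separates the trusted block from the remaining pages with a dividing line. The function name `render`, the mark text, and the divider are hypothetical; a real result page would use HTML styling instead.

```python
def render(site_hits, new_pages, mark="[trusted]", divider="-" * 20):
    """Render results as text, trusted sites first.

    Each trusted site gets the mark before its name; a divider line is
    added only when both trusted sites and ordinary pages are present.
    """
    lines = [f"{mark} {s['name']} - {s['url']}" for s in site_hits]
    if site_hits and new_pages:
        lines.append(divider)
    lines.extend(p["url"] for p in new_pages)
    return "\n".join(lines)
```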
9. A search method running in a search server that includes an information storage and a trusted website storage, wherein the information storage is adapted to store web page information collected from websites connected to the Internet, the web page information including at least the content of each web page and its URL; the trusted website storage is adapted to store website information of a plurality of trusted websites, the website information including at least a website name and the website's address; the method comprising the following steps:
receiving a search keyword submitted from a user terminal;
retrieving from the information storage web page information whose content includes the search keyword;
retrieving from the trusted website storage website information whose website name corresponds to the search keyword; and
combining the retrieved web page information and website information to obtain a search result.
10. The search method according to claim 9, further comprising using a trusted website information processor to process website information obtained in a trusted manner and store it in the trusted website storage.
11. The search method according to claim 10, wherein the website information obtained in a trusted manner includes at least one or more of: website information imported from a trusted network site, manually entered website information, and website information obtained by monitoring users' search behavior.
12. The search method according to any one of claims 9-11, further comprising:
preprocessing the search keyword, the preprocessing including removing common words from the search keyword and/or correcting obvious errors in the keyword, to generate a valid keyword;
retrieving from the information storage web page information whose content includes the valid keyword, retrieving from the trusted website storage, via the trusted website information processor, website information whose website name corresponds to the valid keyword, and combining the retrieved web page information and website information to obtain the search result.
13. The search method according to claim 12, wherein combining the retrieved web page information and website information includes:
when the trusted website information processor does not return the website information, the search result is the web page information retrieved from the information storage;
when the trusted website information processor returns the website information, content corresponding to the website information is deleted from the web page information to obtain new web page information, and the search result is the website information and the new web page information.
14. The search method according to any one of claims 9-13, further comprising processing the search result from the search processor and returning it to the user terminal;
wherein processing the search result includes:
when the search result is the website information and the new web page information, processing the website information in a prominent manner.
15. The search method according to claim 14, wherein processing the website information in a prominent manner consists of adding a trusted mark to the website name of the website information, or placing the website information before the new web page information and distinguishing the two by highlighting or a dividing line.
16. A computer program comprising computer-readable code which, when run on a client or a server, causes the client or the server to execute the search method according to any one of claims 9-15.
17. A computer-readable medium storing the computer program according to claim 16.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210395472.2 | 2012-10-17 | | |
CN2012103954722A, published as CN102937977A (en) | 2012-10-17 | 2012-10-17 | Search server and search method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014059851A1 (en) | 2014-04-24 |
Family
ID=47696874
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2013/083925, published as WO2014059851A1 (en) | 2012-10-17 | 2013-09-22 | Search server and search method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN102937977A (en) |
WO (1) | WO2014059851A1 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102937977A (en) * | 2012-10-17 | 2013-02-20 | 北京奇虎科技有限公司 | Search server and search method |
CN103294789A (en) * | 2013-05-21 | 2013-09-11 | 鸿富锦精密工业(深圳)有限公司 | Information searching system and information searching method |
CN103353900B (en) * | 2013-07-26 | 2017-05-03 | 北京奇虎科技有限公司 | Method, device and system for accessing and certificating web address through search bar |
WO2015139500A1 (en) * | 2014-03-18 | 2015-09-24 | 北京奇虎科技有限公司 | Website analyzing and identifying method and device |
CN104572837B (en) | 2014-12-10 | 2019-07-26 | 百度在线网络技术(北京)有限公司 | The method and device of authentication information is provided on webpage |
CN108090059A (en) * | 2016-11-21 | 2018-05-29 | 百度在线网络技术(北京)有限公司 | Searching method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1785895A2 (en) * | 2005-11-01 | 2007-05-16 | Lycos, Inc. | Method and system for performing a search limited to trusted web sites |
CN101827317A (en) * | 2009-09-07 | 2010-09-08 | 上海银贵网络科技服务有限公司 | Control method and controller for searching target objects via mobile terminals |
CN102375952A (en) * | 2011-10-31 | 2012-03-14 | 北龙中网(北京)科技有限责任公司 | Method for displaying whether website is credibly checked in search engine result |
CN102937977A (en) * | 2012-10-17 | 2013-02-20 | 北京奇虎科技有限公司 | Search server and search method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101059818A (en) * | 2007-06-26 | 2007-10-24 | 申屠浩 | Method for reinforcing search engine result safety |
CN101527721B (en) * | 2009-04-22 | 2012-09-05 | 中兴通讯股份有限公司 | Anti-virus method on the basis of household gateway and device thereof |
CN101957845B (en) * | 2010-09-17 | 2011-11-23 | 百度在线网络技术(北京)有限公司 | On-line application system and implementation method thereof |
- 2012-10-17: CN application CN2012103954722A, published as CN102937977A (en); status: active, Pending
- 2013-09-22: WO application PCT/CN2013/083925, published as WO2014059851A1 (en); status: active, Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN102937977A (en) | 2013-02-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8498984B1 (en) | | Categorization of search results |
CN107103016B (en) | | Method for matching image and content based on keyword representation |
US8694493B2 (en) | | Computer-implemented search using result matching |
US9104772B2 (en) | | System and method for providing tag-based relevance recommendations of bookmarks in a bookmark and tag database |
US8661029B1 (en) | | Modifying search result ranking based on implicit user feedback |
US8799265B2 (en) | | Semantically associated text index and the population and use thereof |
US8849812B1 (en) | | Generating content for topics based on user demand |
US10025855B2 (en) | | Federated community search |
US8635203B2 (en) | | Systems and methods using query patterns to disambiguate query intent |
US8745067B2 (en) | | Presenting comments from various sources |
KR101667344B1 (en) | | Method and system for providing search results |
US9268873B2 (en) | | Landing page identification, tagging and host matching for a mobile application |
CN106415540B (en) | | Federated search |
US20150088846A1 (en) | | Suggesting keywords for search engine optimization |
US10496717B2 (en) | | Storing predicted search results on a user device based on software application use |
WO2014059851A1 (en) | | Search server and search method |
US9367638B2 (en) | | Surfacing actions from social data |
WO2015081848A1 (en) | | Socialized extended search method and corresponding device and system |
US20100057695A1 (en) | | Post-processing search results on a client computer |
WO2015081792A1 (en) | | Method, device, and system for correlative and personalized extended search |
US8799314B2 (en) | | System and method for managing information map |
WO2014059848A1 (en) | | Web page search device and method |
US11341141B2 (en) | | Search system using multiple search streams |
US20150269268A1 (en) | | Search server and search method |
CN107463590B (en) | | Automatic session phase discovery |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 13848002; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 13848002; Country of ref document: EP; Kind code of ref document: A1 |