Nothing Special   »   [go: up one dir, main page]

CN107784082B - Sorting method and device - Google Patents

Sorting method and device Download PDF

Info

Publication number
CN107784082B
CN107784082B CN201710910925.3A CN201710910925A CN107784082B CN 107784082 B CN107784082 B CN 107784082B CN 201710910925 A CN201710910925 A CN 201710910925A CN 107784082 B CN107784082 B CN 107784082B
Authority
CN
China
Prior art keywords
character
sorting
sequence
characters
sequence code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710910925.3A
Other languages
Chinese (zh)
Other versions
CN107784082A (en
Inventor
陈文成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Corp
Original Assignee
Neusoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Corp filed Critical Neusoft Corp
Priority to CN201710910925.3A priority Critical patent/CN107784082B/en
Publication of CN107784082A publication Critical patent/CN107784082A/en
Application granted granted Critical
Publication of CN107784082B publication Critical patent/CN107784082B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • G06F16/90344Query processing by using string matching techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application provides a sorting method and a sorting device, wherein the method comprises the following steps: acquiring a plurality of character strings; inquiring a preset multinational language sequencing word stock, respectively determining character sequence codes corresponding to characters contained in the character strings, and generating sequencing sequence codes corresponding to the character strings; sequencing the character strings according to the sequencing sequence code; the multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language. According to the data sorting method and device, data sorting can be achieved rapidly and efficiently, and user experience is improved.

Description

Sorting method and device
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a sorting method and a sorting device.
Background
In the prior art, when media data on a source device is transmitted to a target device to be played, a list of the media data is often required to be displayed on the target device. If the source device is a closed data system, when the media data of the source device is acquired and the list of the media data is displayed on the target device, the list is in a disorder state, ordered display cannot be realized, a user can find the media data through complicated operations, and user experience is poor. For example, the source device supports the iAP2 protocol, and after songs on the source device are transmitted to the car audio device, a song list displayed on the car audio device cannot be automatically sorted, and does not conform to the use habit of a user, and the user cannot quickly search for songs by searching, so that the user experience is poor.
Disclosure of Invention
The embodiment of the application provides a sorting method and a sorting device, and aims to solve the technical problem that a list of media data cannot be automatically sorted in the prior art.
Therefore, the embodiment of the application provides the following technical scheme:
a first aspect of an embodiment of the present application discloses a sorting method, including: acquiring a plurality of character strings; inquiring a preset multinational language sequencing word stock, respectively determining character sequence codes corresponding to characters contained in the character strings, and generating sequencing sequence codes corresponding to the character strings; the multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language; and sequencing the character strings according to the sequencing sequence code.
In a second aspect of the embodiments of the present application, a sorting apparatus is disclosed, which includes: a first acquisition unit configured to acquire a plurality of character strings; the first query unit is used for querying a preset multinational language sequencing word stock, respectively determining character sequence codes corresponding to characters contained in the character strings, and generating sequencing sequence codes corresponding to the character strings; the multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language; and the sorting unit is used for sorting the character strings according to the sorting sequence code.
In a third aspect of the embodiments of the present application, an apparatus for sorting is disclosed, comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for: acquiring a plurality of character strings; inquiring a preset multinational language sequencing word stock, respectively determining character sequence codes corresponding to characters contained in the character strings, and generating sequencing sequence codes corresponding to the character strings; sequencing the character strings according to the sequencing sequence code; the multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language; .
In a fourth aspect of embodiments of the present application, a machine-readable medium having stored thereon instructions, which when executed by one or more processors, cause an apparatus to perform the ordering method according to the first aspect is disclosed.
According to the sorting method and the sorting device provided by the embodiment of the application, the character sequence codes corresponding to the characters contained in the character strings to be sorted can be determined by inquiring the preset multinational language sorting word stock, the first sorting sequence codes for sorting are generated, and the character strings are sorted according to the sorting sequence codes. The data sorting method and device can quickly and efficiently realize data sorting and improve user experience.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a sorting method according to an embodiment of the present application;
FIG. 2 is a schematic illustration of a sequence provided in accordance with an embodiment of the present application;
FIG. 3 is a flowchart of a retrieval method according to another embodiment of the present application;
FIG. 4 is a schematic diagram of a sorting apparatus according to an embodiment of the present application;
FIG. 5 is a block diagram illustrating an apparatus for sorting according to an example embodiment.
Detailed Description
The embodiment of the application provides a sorting method and a sorting device, which can realize data sorting quickly and efficiently and improve user experience.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
The sorting method shown in the exemplary embodiment of the present application will be described with reference to fig. 1 to 3.
Referring to fig. 1, a flowchart of a sorting method according to an embodiment of the present application is provided. As shown in fig. 1, may include:
s101, a plurality of character strings are obtained.
Wherein the character strings are character strings to be sorted. The character string may be a character sequence composed of letters, Chinese characters, numbers, etc., such as song names, book names, numbers, person names, etc. For example, if song names in the song list are to be sorted, the plurality of character strings correspond to the song names.
S102, inquiring a preset multinational language sequencing word stock, respectively determining character sequence codes corresponding to characters contained in the character strings, and generating sequencing sequence codes corresponding to the character strings.
In the specific implementation of the present application, a multinational language sorting word stock may be pre-constructed, where the multinational language sorting word stock stores a one-to-one correspondence relationship between characters and character sequence codes included in each language. In specific implementation, a character sequence code corresponding to the characters one to one may be set for each character included in languages of different languages, where the character sequence code is used to indicate an arrangement order of the characters included in the languages.
For example, a multi-language word stock may be collected, and sorted word stocks of various languages may be respectively built and combined to form the multi-language sorted word stock. For example, 17 languages such as english, german, france, spanish, portuguese, italian, dutch, finnish, norwegian, swedish, danish, slovakl, polish, czech, hungary, turkish, indonesian, and the like conform to Alphabet letter sequences, and thus the 17 languages can be uniformly arranged to generate Alphabet letter sequences; for another example, russian can be arranged to generate russian letter sequences, greek can be arranged to generate greek letter sequences, and hebrew is arranged to generate hebrew letter sequences; for another example, 20000 Chinese characters can be converted into Pinyin + phonetic symbols, and then the Pinyin + phonetic symbols are sorted to generate a 26 English letter sorting sequence; for another example, tai may arrange 5600 multiple tai words commonly used to generate a tai consonant sequence; as another example, Japanese may sort through the sequence of Roman letters that generate the Japanese fifty-sound diagram. Therefore, a multinational language sorting word stock can be generated, and the sorting word stock stores the one-to-one correspondence of the characters and the character sequence codes and is used as a basis for sorting. Taking Chinese as an example, suppose that the Chinese character 'A' is the first digit in all Chinese characters according to the arrangement sequence of pinyin sequencing, and the corresponding character sequence code is 0001; suppose that the Chinese character 'hey' is the tenth in all Chinese characters according to the arrangement sequence of pinyin sequencing, and the corresponding character sequence code is 0010; assuming that the arrangement sequence of the Chinese characters 'an' in all the Chinese characters according to the pinyin sequence is the ninth ten, the corresponding character sequence codes are 0090 … …, and so on, the one-to-one corresponding relation between all the Chinese characters in the Chinese character library and the character sequence codes thereof can be generated. Of course, the foregoing is merely exemplary and is not to be construed as limiting the present application.
In specific implementation, after obtaining a plurality of character strings, before querying a preset multinational language sorting word stock, the method further comprises: judging whether a first character contained in the character string is a blank or an article; and if so, performing filtering processing on the character strings to avoid sequencing the character strings. For example, if the first word of a song title in a song list is an article or a space, the first word may not participate in the sorting according to the usage habit of the user. Based on this habit, certain words may be filtered. For example, it may be determined whether a first character in the character strings is a space, and if so, filtering may be performed to avoid sorting the character strings. If not, continuing to judge whether the first character is a letter; if not, no filtering process is performed on the string. If the first character is judged to be a letter, converting the first word into a lower case, and judging whether the word is an article; if yes, executing filtering processing; if not, the filtering process is not executed.
After a plurality of character strings to be sorted are determined, a preset multinational language sorting word stock is inquired, character sequence codes corresponding to characters contained in the character strings are respectively determined, and sorting sequence codes corresponding to the character strings are generated. For example, the languages corresponding to the character strings are determined, a sorting word stock corresponding to the current language is searched in a multinational language sorting word stock, and the character sequence code of the character string is searched in the sorting word stock through a binary algorithm. When determining the language corresponding to the current character string, the language may be determined according to system settings or user selections.
In a possible implementation manner, the querying a preset multinational language sorting word stock, respectively determining character sequence codes corresponding to characters included in the plurality of character strings, and generating the sorting sequence codes corresponding to the character strings includes: and acquiring all or part of characters of the character string as characters to be sorted, inquiring a preset multinational language sorting word stock, determining character sequence codes corresponding to the characters, and combining the character sequence codes into a sorting sequence code. For example, all characters in the character string may be used as characters to be sorted, and the character sequence codes corresponding to the respective characters may be combined into a sorting sequence code. Of course, a part of characters of the character string may also be acquired as the characters to be sorted. It should be noted that, in this implementation manner, the length of the partial character may be set empirically, not only considering that the generated sorting sequence code has one-to-one corresponding storage locations in the hash table so as not to cause a collision, but also considering a certain calculation efficiency. Correspondingly, when the plurality of character strings are sorted according to the sorting sequence code, the storage position corresponding to the sorting sequence code may be determined according to the corresponding relationship between the sorting sequence code and the storage position of the hash table, and the character string corresponding to the sorting sequence code may be stored in the storage position. It should be noted that the sorting sequence code corresponds to the hash table storage location one to one.
In a possible implementation manner, the querying a preset multinational language sorting word stock, respectively determining character sequence codes corresponding to characters included in the plurality of character strings, and generating the sorting sequence codes corresponding to the character strings includes: acquiring the first N characters of the character string as characters to be sequenced, inquiring a preset multi-language sequencing word library, determining character sequence codes corresponding to the characters to be sequenced, and combining the character sequence codes into a first sequencing sequence code; and N is less than the number M of characters contained in the character string. . Wherein N and M are integers. For example, the first 4 characters in the character string may be used as the characters to be sorted, and the character sequence codes of the first 4 characters may be obtained and combined to form the sorting sequence code SortCode. Fig. 2 is a schematic diagram of a sequence provided in an embodiment of the present application. Assuming that the obtained character strings for sorting are ABC, AAAAB, AAAA, and AAAAABC, respectively, matching character sequence codes are assigned to the first 4 characters of each character string, and the matching character sequence codes are combined to generate a sorted sequence code, as shown in table 1 below. Corresponding to the implementation of table 2, the implementation of table 1 reduces the complexity of calculation to some extent, but may cause the occurrence of collision of the ordered sequence codes.
TABLE 1
Character string First ordered sequence code A
ABC
0001 0051 0067 0000
AAAAB 0001 0001 0001 0001
AAAA 0001 0001 0001 0001
AAAAABC 0001 0001 0001 0001
It should be noted that table 1 is only an exemplary description, and all characters may be converted into a character sequence code, and a sorted sequence code of the character string may be generated by combination, as shown in table 2. Note that the lengths of the characters of the next character strings may be compared in advance, and the length of the first permuted serial code may be determined by the length of the character string having the longest length, for example, the length of the character string having the longest length in table 2 is AAAABC and the length thereof is 6, so that the length of the first permuted serial code may be set to 4 × 6 — 24. Of course, the length of the first ordered sequence code may be set to a fixed value. The implementation of table 2 reduces the occurrence of ordering sequence code collisions relative to table 1.
TABLE 2
Character string First ordered sequence code B
ABC
0001 0051 0067 0000 0000 0000
AAAAB 0001 0001 0001 0001 0051 0000
AAAA 0001 0001 0001 0001 0000 0000
AAAABC 0001 0001 0001 0001 0051 0067
S103, sequencing the character strings according to the sequencing sequence code.
In a possible implementation manner, if all or part of characters of the character strings are acquired as characters to be sorted, a preset multi-language sorting word stock is inquired, character sequence codes corresponding to the characters are determined and combined into sorting sequence codes, and then sorting processing is performed on the character strings according to the sorting sequence codes, wherein the step of determining storage positions corresponding to the sorting sequence codes according to the corresponding relation between the sorting sequence codes and storage positions of a hash table and storing the character strings corresponding to the sorting sequence codes in the storage positions is carried out.
In another possible implementation manner, if the first N characters of the character string are obtained as the characters to be sorted, querying a preset multi-language sorting word library, determining a character sequence code corresponding to the characters to be sorted, and after combining the character sequence codes into a first sorting sequence code, the sorting the character strings according to the sorting sequence code includes: sorting the character strings into an ordered linked list according to a first sorting sequence code; if the ordered linked list is determined to have a conflict linked list, generating a second sequencing sequence code; wherein the first and second ordered sequence codes are different; and sorting the character strings in the conflict linked list into an ordered linked list according to a second sorting sequence code. In specific implementation, a plurality of character strings, for example, the whole song list, can be sorted into an ordered linked list according to the first sorting sequence code by the Hash algorithm. If the linked list is in conflict during sorting, 4 characters to be sorted are taken from the Hash linked list with the conflict and are converted into character sequence codes to be combined into a second sorting sequence code, and the Hash linked list with the conflict is sorted into an ordered linked list according to the second sorting sequence code through a Hash algorithm. This process is iterated until all the conflicting Hash chain tables are correctly ordered.
As described in detail below, in one possible implementation, the sorting the character strings into the ordered list according to the first sorting sequence code includes: determining a storage position corresponding to a first sorting sequence code according to the corresponding relation between the first sorting sequence code and the storage position of a hash table; judging whether the storage position is empty or not; if the storage position is empty, storing the character string corresponding to the first sequencing serial code in the storage position; if the storage position is not empty, setting a conflict linked list at an entrance of the storage position; and storing the character string corresponding to the first sorting sequence code in the conflict linked list. For example, the sorting sequence code may be set to have an unsigned long integer structure, and there may be hash tables with different hash ranges according to the selected language. For example, the hash range for english is 0x0001000000000000 to 0x05CF05 CF. The storage positions in the hash table have a one-to-one correspondence with the sorting sequence codes. During sorting, the storage position of the hash table can be determined through the first sorting sequence code, whether the position of the hash table is empty or not is checked, if the position of the hash table is empty, the character string corresponding to the first sorting sequence code is stored, if the position of the hash table is not empty, a linked list needs to be arranged at the position entrance of the hash table, and the character string corresponding to the first sorting sequence code is stored in the conflict linked list so as to solve the situation of conflict hash.
Further, if the ordered linked list is determined to have a conflict linked list, generating a second sequencing sequence code; wherein the first and second ordered sequence codes are different; and sorting the character strings in the conflict linked list into a sequential chain according to a second sorting sequence code. Wherein the generating the second ordered sequence code comprises: if the conflict linked list is determined to exist, acquiring each character string contained in the conflict linked list; acquiring part or all of characters of each character string except the first N characters as characters to be ordered, determining character sequence codes corresponding to the characters to be ordered, and combining the character sequence codes into a second ordered sequence code. For example, assuming that the character string is aaaaaabaaaa, first 4 characters are obtained as characters to be sorted, and the generated first sorting sequence code is 0001000100010001; if the character string is determined to exist in the conflict linked list, the characters of the character string except the first 4 characters are obtained as the characters to be sorted, and a second character sequence code 0051006700010001 is generated. For example, whether a hash sequence in conflict exists is determined by whether a linked list is mounted at a hash table entry, if so, a second sorting sequence code SortCode2 of all linked list units in the linked list is obtained, a storage position corresponding to the second sorting sequence code is determined according to the second sorting sequence code, whether the storage position is empty is judged, and if the storage position is empty, the character string is stored in the storage position; if not, setting a conflict linked list at the entry of the storage position, and storing the character strings corresponding to the second sorting sequence code in the conflict linked list … … until there is no conflicting sequence. As shown in fig. 2, after hash ordering is performed on the character strings ABC, aaaaaab, AAAA, and aaaaaaabc according to the first ordering sequence code, aaaaaab, AAAA, and aaaaaaabc in the linked list are arranged in front of the character string ABC, but the ordering and storage of aaaaaab, AAAA, and AAAABC conflict, and the conflicting hash sequences are continuously ordered. For example, the second-rank-ordered sequence codes of aaaaaab, AAAA, and aaaaaabc are 0051000000000000, 0000000000000000, and 0051006700000000, respectively, and are ranked as AAAA, AAAAB, and AAAABC according to the second-rank-ordered sequence codes. The final ordering of these 4 strings is: AAAA, AAAAB, AAAABC, ABC.
Further, the method and the device can also realize quick retrieval of the character strings. Fig. 3 is a flowchart of a retrieval method according to another embodiment of the present application.
S301, dividing the character strings into a plurality of subsets according to the character sequence codes corresponding to the first characters of the character strings, wherein each subset corresponds to one character.
S302, setting the corresponding relation between the character sequence code of the character corresponding to the subset and the retrieval index code.
S303, receiving a retrieval request containing the retrieval index code.
S304, according to the corresponding relation between the retrieval index code and the sequencing sequence code, obtaining a subset corresponding to the character sequence code, and displaying each character string contained in the subset.
The following description will take an example in which a plurality of character strings are sorted into an ordered linked list, and a character string subset is a sublink list. For example, the ordered linked list may be split into linked lists respectively distinguished by first letters according to the character sequence code of the first character, and the character sequence code may be associated with the first-letter retrieval index code. Referring to table 3, intentions are correspondingly expressed for english initials, character sequence codes, and search index codes. When the search is needed, the index code is directly jumped to the corresponding first letter linked list through first letter search, when the first letter linked list does not exist, the condition of the next first letter is continuously searched until the search is successful, and if the end of the linked list is reached and the search is still not successful, the position of the current linked list is kept.
TABLE 3 initial, character sequence code, index code corresponding relation table
First letter Character sequence code Index code for retrieval
A
0001 0
B 0051 1
C 0067 2
D 008D 3
E 00B6 4
F 0100 5
G 0114 6
H 0139 7
I 015E 8
J 018F 9
K 019D 10
L 01C3 11
M 01EE 12
N 021E 13
O 0249 14
P 029A 15
Q 02B4 16
R 02BD 17
S 02EC 18
T 0313 19
U 0337 20
V 037D 21
W 038C 22
X 03A4 23
Y 03B0 24
Z 03CD 25
0 03D1 26
In a specific implementation of the present application, song ordering and retrieval functions in english, german, french, spanish, portuguese, italian, dutch, finnish, norway, swedish, danish, slovak, polish, czech, hungary, turkish, indonesian, russian, greek, hebrew, chinese, thai, and japanese under the iAP2 protocol may be supported. The authentication requirement of the project iPod can be met, the normal voice sequence habit of the user is met, and the user experience effect is improved. The song list can be sorted quickly and efficiently. As shown in table 4, a table is prepared for the number of songs in the actual environment and the consumption time.
TABLE 4 comparison of song count to elapsed time
Figure BDA0001424944010000101
As shown in table 4, when the number of songs is 100, the time consumed for sorting the song list of the Linux system by using the conventional method is 1.133S, the time consumed for sorting by using the method of the present application is 0.983S, and the time consumed is 86% of that of the conventional method; when the number of songs is 500, the time consumed for sequencing the song list of the Linux system by using the traditional method is 6.457, the time consumed for sequencing by using the method of the application is 2.436, and the consumed time is only 27.3 percent of that of the traditional method; … … when the number of songs is 15000, the time consumed for sorting the song list of the Linux system by using the traditional method is 813.741S, the time consumed for sorting by using the method of the present application is 61.731, the time consumption is only 7.5% of the time consumption of the traditional method, the time consumption is reduced, and the processing speed is improved. As can be seen from Table 4, compared with the conventional method, the song sorting time is greatly reduced, and the rapid and accurate sorting is realized.
The following describes a device corresponding to the method provided by the embodiment of the present application.
Referring to fig. 4, a schematic diagram of a sorting apparatus according to an embodiment of the present application is shown.
A sorting apparatus 400 comprising:
a first obtaining unit 401, configured to obtain a plurality of character strings. The specific implementation of the first obtaining unit 401 may be implemented with reference to S101 in the embodiment shown in fig. 1.
A first querying unit 402, configured to query a preset multinational language sorting word stock, determine character sequence codes corresponding to characters included in the multiple character strings, respectively, and generate sorting sequence codes corresponding to the character strings. The multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language. The specific implementation of the querying unit 402 can be implemented with reference to S102 in the embodiment shown in fig. 1.
A sorting unit 403, configured to perform sorting processing on the multiple character strings according to the sorting sequence code. The specific implementation of the sorting unit 403 may be implemented with reference to S103 in the embodiment shown in fig. 1.
In some embodiments, the query unit specifically includes:
the first character acquisition unit to be sorted is used for acquiring all or part of characters of the character string as characters to be sorted;
the first character sequence code determining unit is used for inquiring a preset multinational language sequencing word stock and determining a character sequence code corresponding to each character;
and the first combination unit is used for combining the character sequence codes into the sequencing sequence codes.
In some embodiments, the sorting unit specifically includes:
and the first sequencing unit is used for determining a storage position corresponding to the sequencing sequence code according to the corresponding relation between the sequencing sequence code and the storage position of the hash table, and storing the character string corresponding to the sequencing sequence code in the storage position.
In some embodiments, the query unit specifically includes:
the second character acquisition unit to be sorted is used for acquiring the first N characters of the character string as characters to be sorted; and N is less than the number M of characters contained in the character string.
The second character sequence code determining unit is used for inquiring a preset multinational language sorting word stock and determining the character sequence code corresponding to each character to be sorted;
and the second combination unit is used for combining the character sequence codes into the first sequence code.
In some embodiments, the sorting unit specifically includes:
the second sorting unit sorts the character strings into an ordered linked list according to the first sorting sequence code;
the generating unit is used for generating a second sequencing sequence code if the sequential linked list is determined to have the conflict linked list; wherein the first and second ordered sequence codes are different;
and the third sorting unit is used for sorting the character strings in the conflict linked list into an ordered linked list according to the second sorting sequence code.
Wherein the second sorting unit includes:
the position determining unit is used for determining a storage position corresponding to the first sorting sequence code according to the corresponding relation between the sorting sequence code and the storage position of the hash table;
a judging unit for judging whether the storage location is empty;
the first storage unit is used for storing the character string corresponding to the first sequencing serial code in the storage position if the storage position is empty;
the setting unit is used for setting a conflict linked list at an entrance of the storage position if the storage position is not empty;
and the second storage unit is used for storing the character strings corresponding to the first sorting serial codes in the conflict linked list.
In some embodiments, the generating unit is specifically configured to:
if the conflict linked list is determined to exist, acquiring each character string contained in the conflict linked list; acquiring part or all of characters of each character string except the first N characters as characters to be ordered, determining character sequence codes corresponding to the characters to be ordered, and combining the character sequence codes into a second ordered sequence code.
In some embodiments, the apparatus further comprises:
the retrieval index code setting unit is used for dividing the character strings into a plurality of subsets according to the character sequence codes corresponding to the first characters of the character strings, and each subset corresponds to one character; and setting the corresponding relation between the character sequence code of the character corresponding to the subset and the retrieval index code.
In some embodiments, the apparatus further comprises:
and the retrieval unit is used for receiving a retrieval request containing the retrieval index code, acquiring a subset corresponding to the character sequence code according to the corresponding relation between the retrieval index code and the character sequence code, and displaying each character string contained in the subset.
In some embodiments, the apparatus further comprises:
the filtering unit is used for judging whether a first character contained in the character string is a blank or an article or not after the character strings are obtained and before a preset multinational language sequencing word stock is inquired; and if so, performing filtering processing on the character strings to avoid sequencing the character strings.
Referring to fig. 5, a block diagram of an apparatus for sorting according to another embodiment of the present application is provided. The method comprises the following steps: at least one processor 501 (e.g., CPU), memory 502 and at least one communication bus 503 for enabling communications among the devices. The processor 501 is arranged to execute executable modules, such as computer programs, stored in the memory 502. The Memory 502 may comprise a high-speed Random Access Memory (RAM) and may also include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. One or more programs are stored in the memory and configured to be executed by the one or more processors 501 include instructions for: acquiring a plurality of character strings; inquiring a preset multinational language sequencing word stock, respectively determining character sequence codes corresponding to characters contained in the character strings, and generating sequencing sequence codes corresponding to the character strings; the multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language; and sequencing the character strings according to the sequencing sequence code.
In some embodiments, processor 501 is specifically configured to execute the one or more programs including instructions for: acquiring all or part of characters of the character string as characters to be sorted, inquiring a preset multinational language sorting word stock, determining character sequence codes corresponding to the characters, and combining the character sequence codes into a sorting sequence code; determining a storage position corresponding to the sorting sequence code according to the corresponding relation between the sorting sequence code and the storage position of the hash table, and storing the character string corresponding to the sorting sequence code in the storage position.
In some embodiments, processor 501 is specifically configured to execute the one or more programs including instructions for: acquiring the first N characters of the character string as characters to be sequenced, inquiring a preset multi-language sequencing word library, determining character sequence codes corresponding to the characters to be sequenced, and combining the character sequence codes into a first sequencing sequence code; and N is less than the number M of characters contained in the character string.
In some embodiments, processor 501 is specifically configured to execute the one or more programs including instructions for: sorting the character strings into an ordered linked list according to a first sorting sequence code; if the ordered linked list is determined to have a conflict linked list, generating a second sequencing sequence code; wherein the first and second ordered sequence codes are different; and sorting the character strings in the conflict linked list into an ordered linked list according to a second sorting sequence code.
In some embodiments, processor 501 is specifically configured to execute the one or more programs including instructions for: determining a storage position corresponding to a first sorting sequence code according to the corresponding relation between the first sorting sequence code and the storage position of a hash table; judging whether the storage position is empty or not; if the storage position is empty, storing the character string corresponding to the first sequencing serial code in the storage position; if the storage position is not empty, setting a conflict linked list at an entrance of the storage position; and storing the character string corresponding to the first sorting sequence code in the conflict linked list.
In some embodiments, processor 501 is specifically configured to execute the one or more programs including instructions for: if the conflict linked list is determined to exist, acquiring each character string contained in the conflict linked list; acquiring part or all of characters of each character string except the first N characters as characters to be ordered, determining character sequence codes corresponding to the characters to be ordered, and combining the character sequence codes into a second ordered sequence code.
In some embodiments, processor 501 is specifically configured to execute the one or more programs including instructions for: dividing the character strings into a plurality of subsets according to character sequence codes corresponding to first characters of the character strings, wherein each subset corresponds to one character; and setting the corresponding relation between the character sequence code of the character corresponding to the subset and the retrieval index code.
In some embodiments, processor 501 is specifically configured to execute the one or more programs including instructions for: receiving a retrieval request containing the retrieval index code, acquiring a subset corresponding to the character sequence code according to the corresponding relation between the retrieval index code and the character sequence code, and displaying each character string contained in the subset.
In some embodiments, processor 501 is specifically configured to execute the one or more programs including instructions for: judging whether a first character contained in the character string is a blank or an article; and if so, performing filtering processing on the character strings to avoid sequencing the character strings.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as a memory comprising instructions, executable by a processor of an apparatus to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A machine-readable medium, which may be, for example, a non-transitory computer-readable storage medium, in which instructions, when executed by a processor of an apparatus (terminal or server), enable the apparatus to perform a method of ranking, the method comprising: acquiring a plurality of character strings; inquiring a preset multinational language sequencing word stock, respectively determining character sequence codes corresponding to characters contained in the character strings, and generating sequencing sequence codes corresponding to the character strings; the multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language; and sequencing the character strings according to the sequencing sequence code.
The arrangement of each unit or module of the apparatus of the present application can be implemented by referring to the methods shown in fig. 1 to 3, which are not described herein again.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the attached claims
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort. The foregoing is directed to embodiments of the present application and it is noted that numerous modifications and adaptations may be made by those skilled in the art without departing from the principles of the present application and are intended to be within the scope of the present application.

Claims (7)

1. A method of data ordering, comprising:
acquiring a plurality of character strings;
acquiring the first N characters of the character string as characters to be sequenced, inquiring a preset multi-language sequencing word library, determining character sequence codes corresponding to the characters to be sequenced, and combining the character sequence codes into a first sequencing sequence code; wherein N is less than the number M of characters contained in the character string; the multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language;
sorting the character strings into an ordered linked list according to a first sorting sequence code;
if the ordered linked list is determined to have a conflict linked list, generating a second sequencing sequence code; wherein the first and second ordered sequence codes are different;
and sorting the character strings in the conflict linked list into an ordered linked list according to a second sorting sequence code.
2. The method of claim 1, wherein the sorting the plurality of strings into an ordered linked list according to a first ordering sequence code comprises:
determining a storage position corresponding to a first sorting sequence code according to the corresponding relation between the first sorting sequence code and the storage position of a hash table;
judging whether the storage position is empty or not;
if the storage position is empty, storing the character string corresponding to the first sequencing serial code in the storage position;
if the storage position is not empty, setting a conflict linked list at an entrance of the storage position;
and storing the character string corresponding to the first sorting sequence code in the conflict linked list.
3. The method of claim 1, further comprising:
dividing the character strings into a plurality of subsets according to character sequence codes corresponding to first characters of the character strings, wherein each subset corresponds to one character;
and setting the corresponding relation between the character sequence code of the character corresponding to the subset and the retrieval index code.
4. The method of claim 3, further comprising:
receiving a retrieval request containing the retrieval index code, acquiring a subset corresponding to the character sequence code according to the corresponding relation between the retrieval index code and the character sequence code, and displaying each character string contained in the subset.
5. A sequencing apparatus, comprising:
a first acquisition unit configured to acquire a plurality of character strings;
the first query unit is used for acquiring the first N characters of the character string as characters to be sequenced, querying a preset multi-language sequencing word stock, determining character sequence codes corresponding to the characters to be sequenced, and combining the character sequence codes into a first sequencing sequence code; wherein N is less than the number M of characters contained in the character string; the multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language;
the sorting unit is used for sorting the character strings into an ordered linked list according to a first sorting sequence code; if the ordered linked list is determined to have a conflict linked list, generating a second sequencing sequence code; wherein the first and second ordered sequence codes are different; and sorting the character strings in the conflict linked list into an ordered linked list according to a second sorting sequence code.
6. An apparatus for sorting, comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory, and wherein execution of the one or more programs by one or more processors comprises instructions for:
acquiring a plurality of character strings;
acquiring the first N characters of the character string as characters to be sequenced, inquiring a preset multi-language sequencing word library, determining character sequence codes corresponding to the characters to be sequenced, and combining the character sequence codes into a first sequencing sequence code; wherein N is less than the number M of characters contained in the character string; the multinational language sorting word stock stores the one-to-one corresponding relation between characters contained in each language and character sequence codes; the character sequence code is used for representing the arrangement sequence of each character contained in the language;
sorting the character strings into an ordered linked list according to a first sorting sequence code;
if the ordered linked list is determined to have a conflict linked list, generating a second sequencing sequence code; wherein the first and second ordered sequence codes are different;
and sorting the character strings in the conflict linked list into an ordered linked list according to a second sorting sequence code.
7. A machine-readable medium having stored thereon instructions, which when executed by one or more processors, cause an apparatus to perform a method of sorting as recited in one or more of claims 1-4.
CN201710910925.3A 2017-09-29 2017-09-29 Sorting method and device Active CN107784082B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710910925.3A CN107784082B (en) 2017-09-29 2017-09-29 Sorting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710910925.3A CN107784082B (en) 2017-09-29 2017-09-29 Sorting method and device

Publications (2)

Publication Number Publication Date
CN107784082A CN107784082A (en) 2018-03-09
CN107784082B true CN107784082B (en) 2020-09-25

Family

ID=61433666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710910925.3A Active CN107784082B (en) 2017-09-29 2017-09-29 Sorting method and device

Country Status (1)

Country Link
CN (1) CN107784082B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241558B (en) * 2020-09-03 2024-08-30 深圳市华阳国际工程设计股份有限公司 Unified method and device for element type names and computer storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101213542A (en) * 2005-06-30 2008-07-02 索尼株式会社 Information processor, information processing method and information processing program
CN102981607A (en) * 2011-06-16 2013-03-20 Gn奈康有限公司 Computer-implemented method of arranging text items in a predefined order
CN103514160A (en) * 2012-06-15 2014-01-15 华为终端有限公司 Sorting method and mobile equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101213542A (en) * 2005-06-30 2008-07-02 索尼株式会社 Information processor, information processing method and information processing program
CN102981607A (en) * 2011-06-16 2013-03-20 Gn奈康有限公司 Computer-implemented method of arranging text items in a predefined order
CN103514160A (en) * 2012-06-15 2014-01-15 华为终端有限公司 Sorting method and mobile equipment

Also Published As

Publication number Publication date
CN107784082A (en) 2018-03-09

Similar Documents

Publication Publication Date Title
JP6258191B2 (en) Input method and system
KR101265263B1 (en) Method and system for name matching using phonetic sign and computer readable medium recording the method
CN106682169B (en) Application label mining method and device, application searching method and server
US8914275B2 (en) Text prediction
CN105183761B (en) Sensitive word replacing method and device
US20180260475A1 (en) Systems and methods for verbatim-text mining
JP5449628B2 (en) Determining category information using multistage
KR101326354B1 (en) Transliteration device, recording medium, and method
CN110941959A (en) Text violation detection method, text restoration method, data processing method and data processing equipment
EP3994586A1 (en) Revealing content reuse using fine analysis
CN114021577A (en) Content tag generation method and device, electronic equipment and storage medium
CN111198936A (en) Voice search method and device, electronic equipment and storage medium
CN107784082B (en) Sorting method and device
CN112380445B (en) Data query method, device, equipment and storage medium
CN107679122B (en) Fuzzy search method and terminal
CN106294784B (en) resource searching method and device
US20140129494A1 (en) Searching text via function learning
CN107229349B (en) Character display method and device of input method
JP2009205499A (en) Web page specification apparatus, web page specification method, and program for specifying web page
CN114296561A (en) User word bank obtaining method and candidate word generating method and device
CN110297825B (en) Data processing method, device, computer equipment and storage medium
CN111831876B (en) Query method, device and storage medium
CN113704384A (en) Method and device for generating code through voice recognition, electronic equipment and storage medium
CN107621892B (en) Method and device for acquiring information
CN105653713A (en) Method and device for determining existence of equipment identification codes

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant