CN106933902B - Data multidimensional free analysis query method and device - Google Patents
- Publication number
- CN106933902B (application CN201511032274.XA)
- Authority
- CN
- China
- Prior art keywords
- index
- data
- dimension
- index table
- dimensions
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a data multidimensional free analysis query method and device. The method comprises the following steps: respectively acquiring a data table associated with each dimension in a plurality of dimensions to obtain an associated data table set, wherein the plurality of dimensions are the dimensions required to be analyzed; determining an index table and a non-index table in the associated data table set, wherein the index table is a table containing indexes to be queried in the associated data table set, and the non-index table is a table not containing the indexes to be queried in the associated data table set; filtering the non-index table according to preset filtering conditions; connecting the index table with the filtered non-index table to obtain a data sub-table; and inquiring the index to be inquired and the analyzing dimension in the data sub-table, wherein the analyzing dimension refers to dimension analyzing of the data sub-table according to multiple dimensions. By the method and the device, the problem of low query efficiency of data multidimensional free analysis in the related technology is solved.
Description
Technical Field
The present application relates to the field of data processing, and in particular, to a query method and apparatus for multidimensional free parsing of data.
Background
In a distributed environment, commonly used query engines such as Hive and Impala work on structured databases with metadata, in which the data for each type of event is usually recorded in its own data table. For example, in a "teaching management system," the teaching management database contains a teacher table, a course table, a score table, a student table, a class table and a teaching table, which are used to manage information about students, teachers, courses and the like in the teaching process. As another example, session monitoring on the Internet generally involves a session table (Session), a page view table (PageView), an in-site search table (SiteSearch), an order table (Ecommerce), a custom event table (Event), and many other data tables representing various service scenarios; all of them are associated with one another through the client's session identifier (Session id), and together they form all the entities of a whole session. When a user needs to view index data and analyze dimensions across a plurality of data tables from a plurality of angles, the related technology requires corresponding code to be written for each user query requirement in order to query the relevant index data and dimensions in the database.
No effective solution has yet been proposed for the problem of low query efficiency of data multidimensional free analysis in the related technology.
Disclosure of Invention
The present application mainly aims to provide a data multidimensional free parsing query method and apparatus, so as to solve the problem of low query efficiency in data multidimensional free parsing in the related art.
In order to achieve the above object, according to one aspect of the present application, a query method for multidimensional free parsing of data is provided. The method comprises the following steps: respectively acquiring a data table associated with each dimension in a plurality of dimensions to obtain an associated data table set, wherein the plurality of dimensions are the dimensions required to be analyzed; determining an index table and a non-index table in the associated data table set, wherein the index table is a table containing indexes to be queried in the associated data table set, and the non-index table is a table not containing the indexes to be queried in the associated data table set; filtering the non-index table according to preset filtering conditions; connecting the index table with the filtered non-index table to obtain a data sub-table; and inquiring the index to be inquired and the analyzing dimension in the data sub-table, wherein the analyzing dimension refers to dimension analyzing of the data sub-table according to multiple dimensions.
Further, after filtering the non-index table according to the preset filtering conditions and before connecting the index table with the filtered non-index table to obtain the data sub-table, the method further includes: determining a connection column in the filtered non-index table, wherein the connection column is a data column that needs to be connected with the index table. Connecting the index table with the filtered non-index table to obtain the data sub-table then comprises: connecting the filtered non-index table with the index table through the connection column to obtain the data sub-table.
Further, after filtering the non-index table according to the preset filtering conditions and before connecting the index table with the filtered non-index table to obtain the data sub-table, the method further includes: filtering the index table according to preset filtering conditions; determining, in the filtered index table, a connection field and the fields related to the index to be queried, wherein the connection field is a field that needs to be connected with the filtered non-index table; and taking the table formed by the connection field and the fields related to the index to be queried as the filtered index table. Connecting the index table with the filtered non-index table to obtain the data sub-table then comprises: connecting the filtered non-index table with the filtered index table to obtain the data sub-table.
Further, querying the index to be queried and the analysis dimension in the data sub-table comprises: selecting, in the data sub-table, the dimension column to be queried; determining, in the data sub-table, the index column corresponding to the index to be queried; and querying the index to be queried according to that index column, and performing dimension analysis on the dimension column to be queried according to the plurality of dimensions.
Further, respectively obtaining a data table associated with each of the plurality of dimensions, and obtaining an associated data table set includes: respectively acquiring a data table associated with each dimension in a plurality of dimensions to obtain at least one associated data table; acquiring a correlation key correlated between each correlation data table in at least one correlation data table; and associating at least one associated data table through the association key to obtain an associated data table set.
Further, respectively obtaining a data table associated with each of the plurality of dimensions, and obtaining an associated data table set includes: respectively acquiring a data table associated with each dimension in a plurality of dimensions to obtain at least one associated data table; carrying out duplicate removal filtering processing on the same data table in at least one associated data table to obtain a filtered associated data table; and taking the filtered associated data table as an associated data table set.
Further, determining the index table and the non-index table in the set of associated data tables comprises: receiving an inquiry instruction input from the outside; determining an index to be queried according to the query instruction; and dividing the associated data table set into an index table and a non-index table according to the index to be inquired.
In order to achieve the above object, according to another aspect of the present application, a query apparatus for multidimensional free parsing of data is provided. The apparatus includes: an acquisition unit, configured to respectively acquire a data table associated with each of a plurality of dimensions to obtain an associated data table set, wherein the plurality of dimensions are the dimensions required to be analyzed; a first determining unit, configured to determine an index table and a non-index table in the associated data table set, wherein the index table is a table in the associated data table set that contains the indexes to be queried, and the non-index table is a table in the associated data table set that does not contain the indexes to be queried; a processing unit, configured to filter the non-index table according to preset filtering conditions; a connection unit, configured to connect the index table with the filtered non-index table to obtain a data sub-table; and a query unit, configured to query the index to be queried and the analysis dimension in the data sub-table, wherein the analysis dimension refers to dimension analysis of the data sub-table according to the plurality of dimensions.
Further, the apparatus further comprises: a second determining unit, configured to determine a connection column in the filtered non-index table, wherein the connection column is a data column that needs to be connected with the index table. The connection unit is further configured to connect the filtered non-index table with the index table through the connection column to obtain the data sub-table.
Further, the query unit includes: a selection module, configured to select the dimension column to be queried in the data sub-table; a determining module, configured to determine the index column corresponding to the index to be queried in the data sub-table; and a query module, configured to query the index to be queried in the data sub-table according to the corresponding index column and to perform dimension analysis on the dimension column to be queried according to the plurality of dimensions.
Through the application, the following steps are adopted: respectively acquiring a data table associated with each dimension in a plurality of dimensions to obtain an associated data table set, wherein the plurality of dimensions are the dimensions required to be analyzed; determining an index table and a non-index table in the associated data table set, wherein the index table is a table containing indexes to be queried in the associated data table set, and the non-index table is a table not containing the indexes to be queried in the associated data table set; filtering the non-index table according to preset filtering conditions; connecting the index table with the filtered non-index table to obtain a data sub-table; and querying the index to be queried and the analysis dimension in the data sub-table, wherein the analysis dimension refers to dimension analysis of the data sub-table according to the plurality of dimensions. This solves the problem of low query efficiency of data multidimensional free analysis in the related technology, achieves multi-table linkage for multidimensional free analysis, reduces performance overhead, and improves the efficiency of querying data indexes.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 is a flowchart of a query method for multidimensional free parsing of data according to a first embodiment of the present application;
FIG. 2 is a flowchart of a query method for multidimensional free parsing of data according to a second embodiment of the present application; and
FIG. 3 is a schematic diagram of a query device for multidimensional free parsing of data according to an embodiment of the application.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be used. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Interpretation of terms:
Index: a value that can be aggregated (a metric). For example, the comprehensive browsing amount is an index that is aggregated by summation; the average residence time is also an index, aggregated by averaging. Aggregation operations include summation, averaging, counting and the like.
Dimension: a perspective from which an index is viewed. For example, the browser is a dimension, and Page View (abbreviated as PV) can be queried from the browser dimension, so that it can be known which browsers users use to view pages and how many times pages are viewed with each browser. The operating system is another dimension, and the PV can also be queried from the operating system dimension, so that it can be known which operating systems users use to view pages and how many times pages are viewed on each operating system.
Member: a value within a dimension. For example, the browser is a dimension, and the IE browser and the Chrome browser are members of that dimension.
Multidimensional analysis: the values of multiple indexes can be analyzed from multiple dimensions. For example, after the session volume is viewed from the operating system dimension and several members are selected, the session volume is further viewed from the browser dimension; that is, the session volume is analyzed from two dimensions. By analogy, free analysis can be carried out from multiple dimensions.
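The terms above can be illustrated with a minimal Python sketch. The records, field names (`os`, `browser`, `pv`) and the `aggregate` helper are hypothetical and are not part of the application; the sketch only shows an index (PV, aggregated by summation) viewed along one dimension and then, after a member is selected, along a second dimension.

```python
from collections import defaultdict

# Hypothetical page-view records; field names are illustrative only.
page_views = [
    {"os": "Windows", "browser": "Chrome", "pv": 3},
    {"os": "Windows", "browser": "IE", "pv": 2},
    {"os": "macOS", "browser": "Chrome", "pv": 4},
    {"os": "Windows", "browser": "Chrome", "pv": 1},
]


def aggregate(rows, dimension, index):
    """Sum the index for each member of the given dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[index]
    return dict(totals)


# View the PV index along the operating-system dimension.
pv_by_os = aggregate(page_views, "os", "pv")

# Select the member "Windows", then continue along the browser dimension.
windows_rows = [r for r in page_views if r["os"] == "Windows"]
pv_by_browser = aggregate(windows_rows, "browser", "pv")

print(pv_by_os)
print(pv_by_browser)
```

Selecting a member and re-aggregating along another dimension is what the text calls analyzing the value "from two dimensions".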
According to the embodiment of the application, a query method for multi-dimensional free analysis of data is provided.
Fig. 1 is a flowchart of a query method for multidimensional free parsing of data according to a first embodiment of the present application. As shown in fig. 1, the method comprises the steps of:
Step S101, respectively obtaining a data table associated with each dimension of a plurality of dimensions to obtain an associated data table set, wherein the plurality of dimensions are dimensions required to be analyzed.
A dimension is a structural property of a multidimensional dataset. Dimensions are organized hierarchies (levels) in a data table that describe the classification of data. These classifications and levels describe sets of similar members on which the user will perform analysis. A dimension table may be viewed as a window through which a user analyzes data. It holds properties of the fact records in the data table: some properties provide descriptive information, some specify how the data of the data table is to be summarized in order to provide useful information to the analyst, and some form a hierarchy that helps summarize the data. For example, the data table is a sales table and the dimension table is a region table. Analyzing the sales of commodities in a certain area means observing the sales from the angle of the region, and analyzing the sales of commodities in areas such as Beijing, Shanghai and Guangzhou means observing the sales from multiple angles.
The plurality of dimensions in step S101 are the dimensions that need to be analyzed in the query method of the first embodiment of the present application. In the distributed database, the data table associated with each of the plurality of dimensions is acquired to obtain the associated data table set. Because a single data table may contain two or more of the dimensions, the same data table may be acquired more than once, so deduplication is needed to keep the associated data table set accurate. Accordingly, in the query method for data multidimensional free parsing in the first embodiment of the present application, respectively acquiring the data table associated with each of the plurality of dimensions to obtain the associated data table set includes: respectively acquiring a data table associated with each dimension in the plurality of dimensions to obtain at least one associated data table; performing deduplication filtering on identical data tables in the at least one associated data table to obtain filtered associated data tables; and taking the filtered associated data tables as the associated data table set.
Deduplicating the obtained associated data tables and taking the filtered result as the associated data table set improves the accuracy of the associated data table set.
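The deduplication described above can be sketched as follows. The `DIMENSION_TABLES` mapping and the dimension names are assumptions made for illustration; in practice this information would come from the metadata of the database.

```python
# Hypothetical metadata: which data tables carry each dimension.
DIMENSION_TABLES = {
    "os": ["Session"],
    "browser": ["Session"],
    "page_url": ["PageView"],
    "search_keyword": ["SiteSearch"],
}


def associated_table_set(dimensions):
    """Collect the tables for each dimension, dropping repeats but keeping order."""
    seen = set()
    tables = []
    for dim in dimensions:
        for table in DIMENSION_TABLES[dim]:
            if table not in seen:
                seen.add(table)
                tables.append(table)
    return tables


# "os" and "browser" both live in Session, so Session appears only once.
tables = associated_table_set(["os", "browser", "page_url"])
print(tables)
```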
Optionally, in the query method for multidimensional free parsing of data according to the first embodiment of the present application, the obtaining a data table associated with each of a plurality of dimensions respectively includes: respectively acquiring a data table associated with each dimension in a plurality of dimensions to obtain at least one associated data table; acquiring a correlation key correlated between each correlation data table in at least one correlation data table; and associating at least one associated data table through the association key to obtain an associated data table set.
For example, the at least one associated data table is associated through a session identifier (sessionID) to obtain the associated data table set. The specific association key is not limited in this application. A data table may also share an association key with only one other data table and be associated with the remaining tables in the set only through that table; the specific association key in that case is likewise not limited in this application.
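A minimal sketch of association through keys, including the case where one table reaches the rest of the set only through an intermediate table. The `ASSOCIATION_KEYS` mapping, the `OrderItem` table and the `join_steps` helper are hypothetical; a breadth-first traversal is used here only as one plausible way to derive the join order.

```python
from collections import deque

# Hypothetical association keys between pairs of tables.
ASSOCIATION_KEYS = {
    ("Session", "PageView"): "sessionID",
    ("Session", "Ecommerce"): "sessionID",
    ("Ecommerce", "OrderItem"): "ecommerceid",
}


def join_steps(tables):
    """Return (left, right, key) steps linking every table to the first one."""
    wanted = set(tables)
    neighbours = {t: [] for t in tables}
    for (left, right), key in ASSOCIATION_KEYS.items():
        if left in wanted and right in wanted:
            neighbours[left].append((right, key))
            neighbours[right].append((left, key))
    start = tables[0]
    seen, queue, steps = {start}, deque([start]), []
    while queue:
        current = queue.popleft()
        for other, key in neighbours[current]:
            if other not in seen:
                seen.add(other)
                steps.append((current, other, key))
                queue.append(other)
    missing = wanted - seen
    if missing:
        raise ValueError(f"no association key reaches {sorted(missing)}")
    return steps


# OrderItem has no key in common with Session and is reached via Ecommerce.
steps = join_steps(["Session", "Ecommerce", "OrderItem"])
print(steps)
```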
Through the steps, the data table associated with each dimension in the multiple dimensions is obtained, namely the metadata information of the subsequent query index data is obtained. It should be noted that the query method for multidimensional free analysis of data provided by the present application is a query method for multidimensional free analysis of data in the analysis field.
Step S102, an index table and a non-index table in the associated data table set are determined, wherein the index table is a table containing indexes to be inquired in the associated data table set, and the non-index table is a table not containing the indexes to be inquired in the associated data table set.
The associated data table set comprises an index table and a non-index table, wherein the index table is a table containing indexes to be inquired in the associated data table set, and the non-index table is a table not containing the indexes to be inquired in the associated data table set. The index to be queried is an index that a user needs to query, for example, the index to be queried may be an access amount, a comprehensive browsing amount, an average staying time, a jumping rate, an average page access number, and the like.
Optionally, in the query method for multidimensional free parsing of data in the first embodiment of the present application, determining the index table and the non-index table in the associated data table set may further be implemented by: receiving an inquiry instruction input from the outside; determining an index to be queried according to the query instruction; and dividing the associated data table set into an index table and a non-index table according to the index to be inquired.
Through the steps, the associated data table set is divided into an index table and a non-index table.
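The division into index tables and non-index tables can be sketched as below, assuming a hypothetical `INDEX_TABLE` mapping from each index to the table on which it is computed.

```python
# Hypothetical metadata: the table on which each index is computed.
INDEX_TABLE = {
    "visits": "Session",
    "page_views": "PageView",
    "order_amount": "Ecommerce",
}


def split_tables(table_set, indexes_to_query):
    """Partition the associated table set by whether a table holds a queried index."""
    index_names = {INDEX_TABLE[i] for i in indexes_to_query}
    index_tables = [t for t in table_set if t in index_names]
    non_index_tables = [t for t in table_set if t not in index_names]
    return index_tables, non_index_tables


index_tables, non_index_tables = split_tables(
    ["Session", "PageView", "SiteSearch"], ["page_views"]
)
print(index_tables, non_index_tables)
```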
Step S103, filtering the non-index table according to preset filtering conditions.
The non-index table is filtered according to preset filtering conditions. In step S103, the preset filtering conditions may be: all Session dimension filters plus the dimension filters specific to the non-index table.
The preset filtering conditions are formed by matching the dimensions and indexes against the metadata, which determines, for each table, the columns to be selected and the filters to be applied.
Optionally, in the query method for multidimensional free parsing of data in the first embodiment of the present application, after filtering the non-index table according to the preset filtering conditions and before connecting the index table with the filtered non-index table to obtain the data sub-table, the method further includes: determining a connection column in the filtered non-index table, wherein the connection column is a data column that needs to be connected with the index table. Connecting the index table with the filtered non-index table to obtain the data sub-table then comprises: connecting the filtered non-index table with the index table through the connection column to obtain the data sub-table.
By selecting only the required columns and applying a DISTINCT selection to the table connection fields, such as sessionID, ecommerceid and sitesearch, the data expansion that would otherwise occur when the tables are subsequently connected, and that would reduce the efficiency of querying index data, is avoided.
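The effect of the DISTINCT selection on the connection column can be seen in a small sketch using Python's built-in `sqlite3` module. The table layouts and data are invented for illustration, and SQLite stands in for a distributed engine such as Hive or Impala. Joining the raw non-index table multiplies the rows of the index table, whereas joining the DISTINCT set of connection keys does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pageview (sessionID TEXT, url TEXT);       -- index table (PV)
CREATE TABLE sitesearch (sessionID TEXT, keyword TEXT); -- non-index table
INSERT INTO pageview VALUES ('s1', '/a'), ('s1', '/b'), ('s2', '/a');
INSERT INTO sitesearch VALUES
  ('s1', 'shoes'), ('s1', 'shoes'), ('s1', 'shoes'), ('s2', 'hats');
""")

# Direct join: each page view of s1 is repeated once per matching search row.
naive = conn.execute("""
SELECT COUNT(*) FROM pageview AS p
JOIN sitesearch AS s ON p.sessionID = s.sessionID
WHERE s.keyword = 'shoes'
""").fetchone()[0]

# Filter first, keep only the DISTINCT connection column, then join.
deduplicated = conn.execute("""
SELECT COUNT(*) FROM pageview AS p
JOIN (SELECT DISTINCT sessionID FROM sitesearch WHERE keyword = 'shoes') AS s
  ON p.sessionID = s.sessionID
""").fetchone()[0]

print(naive, deduplicated)
```

In this data, session `s1` has two page views and three matching searches, so the direct join counts six rows while the deduplicated join counts the two actual page views.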
Optionally, in the query method for multidimensional free parsing of data in the first embodiment of the present application, after filtering the non-index table according to the preset filtering conditions and before connecting the index table with the filtered non-index table to obtain the data sub-table, the method further includes: filtering the index table according to preset filtering conditions; determining, in the filtered index table, a connection field and the fields related to the index to be queried, wherein the connection field is a field that needs to be connected with the filtered non-index table; and taking the table formed by the connection field and the fields related to the index to be queried as the filtered index table. Connecting the index table with the filtered non-index table to obtain the data sub-table then comprises: connecting the filtered non-index table with the filtered index table to obtain the data sub-table.
The index table is filtered according to preset filtering conditions, which may be: all Session dimension filters plus the dimension filters specific to the index table. By keeping only the table connection field and the fields needed for index calculation, the index table is reduced before the join, which avoids the data expansion that would otherwise reduce the efficiency of querying index data when the tables are subsequently connected.
Step S104, connecting the index table with the filtered non-index table to obtain a data sub-table.
The index table and the filtered non-index table may be connected in various ways, such as an inner join, an outer join, or a cross join. The result set of an inner join contains only the rows that satisfy the join condition; the inner join is the default join mode of SQL Server and, depending on the comparison used, is divided into three types: equi-join, natural join and non-equi join. The result set of a cross join contains every combination of the rows in the two tables. The result set of an outer join contains not only the rows that satisfy the join condition but also all rows of one of the tables (or of both); there are three types of outer join: left outer join, right outer join and full outer join. Multi-table queries can be implemented with join operators. Join operations give the user great flexibility to add new data types at any time: a new table is created for each new entity and then queried through a join.
The index table and the filtered non-index table are connected by any of the above modes, or by another connection mode, to obtain the data sub-table.
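The difference between the join types can be checked with a short `sqlite3` sketch. The tables and rows are hypothetical; only inner, left outer and cross joins are shown, since those are available in every SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE session (sessionID TEXT, browser TEXT);
CREATE TABLE ecommerce (sessionID TEXT, amount REAL);
INSERT INTO session VALUES ('s1', 'Chrome'), ('s2', 'IE'), ('s3', 'Chrome');
INSERT INTO ecommerce VALUES ('s1', 10.0), ('s1', 5.0), ('s4', 7.0);
""")


def row_count(sql):
    """Count the rows produced by a SELECT statement."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]


# Inner join: only rows whose sessionID exists in both tables.
inner_rows = row_count(
    "SELECT s.sessionID FROM session AS s "
    "INNER JOIN ecommerce AS e ON s.sessionID = e.sessionID"
)
# Left outer join: every session row, matched or not.
left_rows = row_count(
    "SELECT s.sessionID FROM session AS s "
    "LEFT OUTER JOIN ecommerce AS e ON s.sessionID = e.sessionID"
)
# Cross join: every combination of rows from the two tables.
cross_rows = row_count(
    "SELECT s.sessionID FROM session AS s CROSS JOIN ecommerce AS e"
)

print(inner_rows, left_rows, cross_rows)
```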
Step S105, inquiring the index to be inquired and the analyzing dimension in the data sub-table, wherein the analyzing dimension refers to dimension analyzing of the data sub-table according to multiple dimensions.
By the method comprising steps S101 to S105, the sub-tables are defined first and a DISTINCT selection is performed on the table connection fields. A distributed environment is particularly good at computing DISTINCT sets, which performs better than a table join; the result sets are then joined to obtain the data sub-table. This avoids data expansion and ensures that the index data is queried correctly. The problem of low query efficiency of multidimensional free analysis of data in the related technology is thereby solved: multi-table linkage for multidimensional free analysis is achieved, performance overhead is reduced, and the efficiency of querying data indexes is improved.
In summary, in the query method for multidimensional free parsing of data provided in the first embodiment of the present application, an associated data table set is obtained by respectively obtaining a data table associated with each of multiple dimensions, where the multiple dimensions are dimensions that need to be parsed; an index table and a non-index table in the associated data table set are determined, wherein the index table is a table containing indexes to be queried in the associated data table set, and the non-index table is a table not containing the indexes to be queried in the associated data table set; the non-index table is filtered according to preset filtering conditions; the index table is connected with the filtered non-index table to obtain a data sub-table; and the index to be queried and the analysis dimension are queried in the data sub-table, wherein the analysis dimension refers to dimension analysis of the data sub-table according to the multiple dimensions. This solves the problem of low query efficiency of multidimensional free analysis of data in the related technology, achieves multi-table linkage for multidimensional free analysis, reduces performance overhead, and improves the efficiency of querying data indexes.
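Steps S101 to S105 can be put together in one hedged sketch. It uses `sqlite3` in place of a distributed engine, and the tables, columns and the single filter (`os = 'Windows'`) are assumptions chosen for illustration rather than details from the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE session (sessionID TEXT, os TEXT, browser TEXT);
CREATE TABLE pageview (sessionID TEXT, url TEXT);
INSERT INTO session VALUES
  ('s1', 'Windows', 'Chrome'), ('s2', 'Windows', 'IE'), ('s3', 'macOS', 'Chrome');
INSERT INTO pageview VALUES
  ('s1', '/a'), ('s1', '/b'), ('s2', '/a'),
  ('s3', '/a'), ('s3', '/b'), ('s3', '/c');
""")

# S101/S102: the dimensions os and browser live in `session` (non-index table);
#            the index "page views" is computed on `pageview` (index table).
# S103: filter the non-index table, keeping DISTINCT join and dimension columns.
# S104: join the filtered result with the index table to form the data sub-table.
# S105: aggregate the index along the remaining dimension.
query = """
WITH filtered_session AS (
    SELECT DISTINCT sessionID, browser
    FROM session
    WHERE os = 'Windows'
),
sub_table AS (
    SELECT f.browser, p.url
    FROM pageview AS p
    JOIN filtered_session AS f ON p.sessionID = f.sessionID
)
SELECT browser, COUNT(*) AS page_views
FROM sub_table
GROUP BY browser
ORDER BY page_views DESC
"""
result = conn.execute(query).fetchall()
print(result)
```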
FIG. 2 is a flowchart of a query method for multidimensional free parsing of data according to a second embodiment of the present application. Fig. 2 may be considered as a preferred implementation of the embodiment shown in fig. 1. As shown in fig. 2, the method comprises the steps of:
Step S201, obtaining data tables associated with multiple dimensions, respectively, to obtain an associated data table set.
Step S201 is the same as step S101, and is not described herein again.
Step S202, an index table and a non-index table in the associated data table set are determined, wherein the index table is a table containing indexes to be queried in the associated data table set, and the non-index table is a table not containing the indexes to be queried in the associated data table set.
Step S202 is the same as step S102, and is not described herein again.
Step S203, filtering the non-index table according to preset filtering conditions.
Step S203 is the same as step S103, and is not described herein again.
Step S204, connecting the index table with the filtered non-index table to obtain a data sub-table.
Step S204 is the same as step S104, and is not described herein again.
Step S205, selecting a dimension column to be queried in the data sub-table.
During multidimensional analysis, the analysis chain (the preset filtering conditions) and the relevant indexes on the index table being analyzed are passed in, and filtering is carried out according to the analysis chain. If the chain contains dimension filters on other data tables, including non-index tables, those tables are first filtered by the relevant dimensions and the filtered result is then connected with the index table (according to the metadata), so that the filtering conditions on the other data tables take effect. For example, the analysis chain may be: operating system version (window7) >> browser (chrome) >> is new visitor (yes) >> page views (1) >> access duration. A dimension column to be queried is then selected in the data sub-table; here the access duration is the dimension column to be queried.
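One way to picture the analysis chain is as a list of (dimension, member) links whose last link has no member selected yet. The sketch below splits such a chain into per-table filters and the dimension to be queried. The `DIMENSION_TABLE` mapping, the dimension identifiers and the assignment of dimensions to tables are all hypothetical.

```python
# Hypothetical metadata: the table that owns each dimension.
DIMENSION_TABLE = {
    "os_version": "Session",
    "browser": "Session",
    "is_new_visitor": "Session",
    "page_views": "PageView",
    "access_duration": "Session",
}

# Each link is (dimension, selected member); the final link has no member yet.
chain = [
    ("os_version", "window7"),
    ("browser", "chrome"),
    ("is_new_visitor", "yes"),
    ("page_views", 1),
    ("access_duration", None),
]


def split_chain(links):
    """Group the chain's filters by table and return the dimension to query."""
    filters = {}
    target = None
    for dimension, member in links:
        if member is None:
            target = dimension
        else:
            table = DIMENSION_TABLE[dimension]
            filters.setdefault(table, []).append((dimension, member))
    return filters, target


filters, target = split_chain(chain)
print(filters)
print(target)
```

Grouping the filters by table makes it straightforward to filter each table separately before it is connected with the index table.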
Step S206, determining the index column corresponding to the index to be queried in the data sub-table.
The index columns corresponding to the indexes to be queried are, for example, the columns for visits, page views, average stay time, bounce rate, and average pages per visit.
It should be noted that the more distinct values a selected column contains, the slower the query, and the longer the values in a selected column, the slower the query.
Step S207, querying the index to be queried in the data sub-table according to the corresponding index column, and performing dimension analysis on the dimension column to be queried according to the multiple dimensions.
For example, index data such as visits, page views, average stay time, bounce rate, and average pages per visit are queried in the data sub-table according to the index columns corresponding to the indexes to be queried, and analysis dimensions such as whether the visitor is new, the browser, and the page views are queried in the data sub-table according to the dimension columns to be queried.
In the query method for multidimensional free analysis of data in the second embodiment of the present application, the query may involve the following operations: GROUP BY on the relevant query columns; ORDER BY on the relevant index columns; a packing operation; and selecting the GROUP BY dimension columns and the index columns, thereby querying the index data corresponding to the index to be queried.
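As an illustration of grouping by a dimension column and ordering by an index column, the sketch below runs a query against an in-memory SQLite table standing in for the data sub-table. The table name `sub_table`, its columns, and the sample rows are assumptions made for this example; the packing operation mentioned above is not modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sub_table ("
    "browser TEXT, is_new_visitor INTEGER, page_views INTEGER, visit_duration REAL)"
)
conn.executemany(
    "INSERT INTO sub_table VALUES (?, ?, ?, ?)",
    [
        ("Chrome", 1, 3, 40.0),
        ("Chrome", 1, 1, 20.0),
        ("Firefox", 0, 2, 10.0),
    ],
)

# Select the GROUP BY dimension column together with the index columns,
# and order the result by one of the index columns.
rows = conn.execute(
    """
    SELECT browser,
           COUNT(*)            AS visits,
           SUM(page_views)     AS total_page_views,
           AVG(visit_duration) AS avg_duration
    FROM sub_table
    GROUP BY browser
    ORDER BY visits DESC
    """
).fetchall()
conn.close()
```

Each result row holds one value of the dimension column followed by the aggregated index values for that group.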
Through steps S201 to S207, the dimension column to be queried and the related index columns are selected from the data sub-table, so that multiple tables are linked for multidimensional free analysis, performance overhead is reduced, and data indexes are queried more efficiently.
In summary, in the query method for multidimensional free analysis of data provided in the second embodiment of the present application, the data table associated with each of multiple dimensions is obtained to form an associated data table set, where the multiple dimensions are the dimensions to be analyzed; an index table and a non-index table are determined in the associated data table set, where the index table is a table in the set that contains the index to be queried and the non-index table is a table in the set that does not; the non-index table is filtered according to a preset filtering condition; the index table is connected with the filtered non-index table to obtain a data sub-table; a dimension column to be queried is selected in the data sub-table; the index column corresponding to the index to be queried is determined in the data sub-table; and the index data are queried in the data sub-table according to that index column, while the dimensions are queried and analyzed according to the dimension column to be queried. This solves the problem of low query efficiency for multidimensional free analysis of data in the related art: multiple tables are linked for multidimensional free analysis, performance overhead is reduced, and data indexes are queried more efficiently.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from the one presented herein.
The embodiment of the present application further provides a query device for multidimensional free analysis of data. It should be noted that this query device may be used to execute the query method for multidimensional free analysis of data provided in the embodiments of the present application. The query device is described below.
FIG. 3 is a schematic diagram of a query device for multidimensional free analysis of data according to an embodiment of the present application. As shown in FIG. 3, the device includes an acquisition unit 10, a first determining unit 20, a processing unit 30, a connection unit 40, and a query unit 50.
The acquisition unit 10 is configured to acquire the data table associated with each of multiple dimensions to obtain an associated data table set, where the multiple dimensions are the dimensions to be analyzed.
The first determining unit 20 is configured to determine an index table and a non-index table in the associated data table set, where the index table is a table in the associated data table set that contains the index to be queried, and the non-index table is a table in the associated data table set that does not contain the index to be queried.
The processing unit 30 is configured to filter the non-index table according to a preset filtering condition.
The connection unit 40 is configured to connect the index table with the filtered non-index table to obtain a data sub-table.
The query unit 50 is configured to query the index to be queried and the analysis dimension in the data sub-table, where the analysis dimension refers to performing dimension analysis on the data sub-table according to the multiple dimensions.
In the query device for multidimensional free analysis of data provided by the embodiment of the present application, the acquisition unit 10 acquires the data table associated with each of multiple dimensions to obtain an associated data table set, where the multiple dimensions are the dimensions to be analyzed; the first determining unit 20 determines an index table and a non-index table in the associated data table set, where the index table is a table in the set that contains the index to be queried and the non-index table is a table in the set that does not; the processing unit 30 filters the non-index table according to a preset filtering condition; the connection unit 40 connects the index table with the filtered non-index table to obtain a data sub-table; and the query unit 50 queries the index to be queried and the analysis dimension in the data sub-table, where the analysis dimension refers to performing dimension analysis on the data sub-table according to the multiple dimensions. This solves the problem of low query efficiency for multidimensional free analysis of data in the related art: multiple tables are linked for multidimensional free analysis, performance overhead is reduced, and data indexes are queried more efficiently.
Optionally, the query device for multidimensional free analysis of data provided in the embodiment of the present application further includes a second determining unit, configured to determine a connection column in the filtered non-index table, where the connection column is a data column that needs to be connected with the index table. The connection unit is further configured to connect the filtered non-index table with the index table through the connection column to obtain the data sub-table.
Optionally, in the query device for multidimensional free analysis of data provided in the embodiment of the present application, the query unit 50 includes: a selection module, configured to select a dimension column to be queried in the data sub-table; a determining module, configured to determine the index column corresponding to the index to be queried in the data sub-table; and a query module, configured to query the index to be queried in the data sub-table according to the index column corresponding to the index to be queried, and to perform dimension analysis on the dimension column to be queried according to the multiple dimensions.
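One way to picture how the five units cooperate is the following sketch, in which each unit is modelled as a method of a single hypothetical class. The class name `QueryDevice`, the method names, the dictionary-based table representation, the equality-only filtering, and the use of a sum as the queried index are all assumptions made for illustration; they are not taken from the application.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Tables = Dict[str, List[Row]]


class QueryDevice:
    """Hypothetical composition of the five units described above."""

    def __init__(self, catalog: Tables, dimension_tables: Dict[str, str]):
        self.catalog = catalog                    # table name -> rows
        self.dimension_tables = dimension_tables  # dimension -> table name

    def acquire(self, dimensions: List[str]) -> Tables:
        # Acquisition unit: collect the table behind each dimension, once each.
        names = {self.dimension_tables[d] for d in dimensions}
        return {n: self.catalog[n] for n in sorted(names)}

    def determine(self, tables: Tables, index: str) -> Tuple[Tables, Tables]:
        # First determining unit: a table holding the index column is an index table.
        index_t = {n: rows for n, rows in tables.items() if rows and index in rows[0]}
        non_index_t = {n: rows for n, rows in tables.items() if n not in index_t}
        return index_t, non_index_t

    def process(self, rows: List[Row], conditions: Dict[str, Any]) -> List[Row]:
        # Processing unit: keep rows that satisfy every equality condition.
        return [r for r in rows if all(r[c] == v for c, v in conditions.items())]

    def connect(self, index_rows: List[Row], other_rows: List[Row], key: str) -> List[Row]:
        # Connection unit: inner join on the connection column (assumed unique in other_rows).
        by_key = {r[key]: r for r in other_rows}
        return [{**by_key[r[key]], **r} for r in index_rows if r[key] in by_key]

    def query(self, sub_table: List[Row], dimension: str, index: str) -> Dict[Any, Any]:
        # Query unit: total the index for each value of the dimension column.
        totals: Dict[Any, Any] = {}
        for r in sub_table:
            totals[r[dimension]] = totals.get(r[dimension], 0) + r[index]
        return totals


catalog = {
    "session": [
        {"session_id": 1, "browser": "Chrome", "is_new_visitor": True},
        {"session_id": 2, "browser": "Firefox", "is_new_visitor": True},
        {"session_id": 3, "browser": "Chrome", "is_new_visitor": False},
    ],
    "pageview": [
        {"session_id": 1, "page_url": "/a", "page_views": 3},
        {"session_id": 2, "page_url": "/b", "page_views": 2},
        {"session_id": 3, "page_url": "/a", "page_views": 5},
    ],
}
device = QueryDevice(
    catalog,
    {"browser": "session", "is_new_visitor": "session", "page_url": "pageview"},
)
tables = device.acquire(["browser", "is_new_visitor", "page_url"])
index_t, non_index_t = device.determine(tables, "page_views")
filtered = device.process(non_index_t["session"], {"is_new_visitor": True})
sub_table = device.connect(index_t["pageview"], filtered, "session_id")
result = device.query(sub_table, "browser", "page_views")
```

In this sketch the optional second determining unit is reduced to the `key` argument of `connect`, which names the connection column.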
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed device may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, and some features may be omitted or not implemented.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
It will be apparent to those skilled in the art that the modules or steps of the present application described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and they may alternatively be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, or fabricated separately as individual integrated circuit modules, or fabricated as a single integrated circuit module from multiple modules or steps. Thus, the present application is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Claims (10)
1. A query method for multidimensional free analysis of data is characterized by comprising the following steps:
respectively acquiring a data table associated with each dimension in a plurality of dimensions to obtain an associated data table set, wherein the plurality of dimensions are the dimensions required to be analyzed;
determining an index table and a non-index table in the associated data table set, wherein the index table is a table containing indexes to be queried in the associated data table set, and the non-index table is a table not containing the indexes to be queried in the associated data table set;
filtering the non-index table according to a preset filtering condition;
connecting the index table with the filtered non-index table to obtain a data sub-table; and
querying the index to be queried and the analysis dimension in the data sub-table, wherein the analysis dimension refers to performing dimension analysis on the data sub-table according to the plurality of dimensions;
wherein the preset filtering condition is filtering of all session dimensions in the non-index table and filtering of specific dimensions in the non-index table.
2. The method of claim 1,
after filtering the non-index table according to a preset filtering condition, before connecting the index table with the filtered non-index table to obtain a data sub-table, the method further comprises: determining a connection column in the filtered non-index table, wherein the connection column is a data column needing to be connected with the index table,
wherein connecting the index table with the filtered non-index table to obtain a data sub-table comprises: connecting the filtered non-index table with the index table through the connection column to obtain the data sub-table.
3. The method according to claim 1 or 2,
after filtering the non-index table according to a preset filtering condition, and before connecting the index table with the filtered non-index table to obtain a data sub-table, the method further comprises: filtering the index table according to the preset filtering condition; determining, in the filtered index table, a connection field and a field related to the index to be queried, wherein the connection field is a field that needs to be connected with the filtered non-index table; and taking a table formed by the connection field and the field related to the index to be queried in the filtered index table as a filtered index table,
wherein connecting the index table with the filtered non-index table to obtain a data sub-table comprises: connecting the filtered non-index table with the filtered index table to obtain the data sub-table.
4. The method of claim 1, wherein querying the index to be queried and the analysis dimension in the data sub-table comprises:
selecting a dimension column to be queried in the data sub-table;
determining an index column corresponding to the index to be queried in the data sub-table; and
querying the index to be queried in the data sub-table according to the index column corresponding to the index to be queried, and performing dimension analysis on the dimension column to be queried according to the plurality of dimensions.
5. The method of claim 1, wherein obtaining the data table associated with each of the plurality of dimensions separately, and obtaining the set of associated data tables comprises:
respectively acquiring a data table associated with each dimension of the plurality of dimensions to obtain at least one associated data table;
acquiring an association key that associates the associated data tables in the at least one associated data table with one another; and
associating the at least one associated data table through the association key to obtain the associated data table set.
6. The method of claim 1, wherein obtaining the data table associated with each of the plurality of dimensions separately, and obtaining the set of associated data tables comprises:
respectively acquiring a data table associated with each dimension in a plurality of dimensions to obtain at least one associated data table;
performing de-duplication filtering on identical data tables in the at least one associated data table to obtain a filtered associated data table; and
taking the filtered associated data table as the associated data table set.
7. The method of claim 1, wherein determining an index table and a non-index table in the set of associated data tables comprises:
receiving an externally input query instruction;
determining the index to be queried according to the query instruction; and
dividing the associated data table set into the index table and the non-index table according to the index to be queried.
8. A query device for multidimensional free analysis of data, comprising:
an acquisition unit, configured to respectively acquire a data table associated with each dimension in a plurality of dimensions to obtain an associated data table set, wherein the plurality of dimensions are the dimensions required to be analyzed;
a first determining unit, configured to determine an index table and a non-index table in the associated data table set, where the index table is a table in the associated data table set that includes an index to be queried, and the non-index table is a table in the associated data table set that does not include the index to be queried;
a processing unit, configured to filter the non-index table according to a preset filtering condition;
a connection unit, configured to connect the index table with the filtered non-index table to obtain a data sub-table; and
a query unit, configured to query the index to be queried and the analysis dimension in the data sub-table, wherein the analysis dimension refers to performing dimension analysis on the data sub-table according to the plurality of dimensions;
wherein the preset filtering condition is filtering of all session dimensions in the non-index table and filtering of specific dimensions in the non-index table.
9. The apparatus of claim 8, further comprising: a second determining unit, configured to determine a connection column in the filtered non-index table, where the connection column is a data column that needs to be connected to the index table,
the connection unit is further configured to connect the filtered non-index table with the index table through the connection column to obtain the data sub-table.
10. The apparatus of claim 8, wherein the query unit comprises:
a selection module, configured to select a dimension column to be queried in the data sub-table;
a determining module, configured to determine an index column corresponding to the index to be queried in the data sub-table; and
a query module, configured to query the index to be queried in the data sub-table according to the index column corresponding to the index to be queried, and to perform dimension analysis on the dimension column to be queried according to the plurality of dimensions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201511032274.XA CN106933902B (en) | 2015-12-31 | 2015-12-31 | Data multidimensional free analysis query method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201511032274.XA CN106933902B (en) | 2015-12-31 | 2015-12-31 | Data multidimensional free analysis query method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106933902A (en) | 2017-07-07 |
CN106933902B (en) | 2020-02-07 |
Family
ID=59444177
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201511032274.XA Active CN106933902B (en) | 2015-12-31 | 2015-12-31 | Data multidimensional free analysis query method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106933902B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110209687B (en) * | 2018-02-23 | 2021-06-22 | 北京国双科技有限公司 | Multi-dimensional attribution query method and device |
CN109635014A (en) * | 2018-12-12 | 2019-04-16 | 苏州思必驰信息科技有限公司 | The display methods and system of user's assets information |
CN112948441B (en) * | 2021-03-26 | 2023-09-29 | 浪潮通用软件有限公司 | Multi-dimensional data collection method and equipment for financial data |
CN114579619B (en) * | 2022-04-28 | 2023-01-20 | 北京达佳互联信息技术有限公司 | Data query method and device, electronic equipment and storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103699966A (en) * | 2013-04-12 | 2014-04-02 | 国家电网公司 | Multidimensional overall-process data control system and control method based on user requirement |
CN104391951A (en) * | 2014-11-27 | 2015-03-04 | 北京国双科技有限公司 | Web page thermodynamic diagram loading method and device |
CN104392001A (en) * | 2014-12-15 | 2015-03-04 | 北京国双科技有限公司 | Database inquiry method and device |
CN104408180A (en) * | 2014-12-15 | 2015-03-11 | 北京国双科技有限公司 | Stored data inquiring method and device |
CN104408179A (en) * | 2014-12-15 | 2015-03-11 | 北京国双科技有限公司 | Method and device for processing data from data table |
CN104424229A (en) * | 2013-08-26 | 2015-03-18 | 腾讯科技(深圳)有限公司 | Calculating method and system for multi-dimensional division |
CN104484398A (en) * | 2014-12-12 | 2015-04-01 | 北京国双科技有限公司 | Method and device for aggregation of data in datasheet |
CN106933914A (en) * | 2015-12-31 | 2017-07-07 | 北京国双科技有限公司 | The data processing method and device of many tables of data |
CN106933893A (en) * | 2015-12-31 | 2017-07-07 | 北京国双科技有限公司 | The querying method and device of multi-dimensional data |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103699966A (en) * | 2013-04-12 | 2014-04-02 | 国家电网公司 | Multidimensional overall-process data control system and control method based on user requirement |
CN104424229A (en) * | 2013-08-26 | 2015-03-18 | 腾讯科技(深圳)有限公司 | Calculating method and system for multi-dimensional division |
CN104391951A (en) * | 2014-11-27 | 2015-03-04 | 北京国双科技有限公司 | Web page thermodynamic diagram loading method and device |
CN104484398A (en) * | 2014-12-12 | 2015-04-01 | 北京国双科技有限公司 | Method and device for aggregation of data in datasheet |
CN104392001A (en) * | 2014-12-15 | 2015-03-04 | 北京国双科技有限公司 | Database inquiry method and device |
CN104408180A (en) * | 2014-12-15 | 2015-03-11 | 北京国双科技有限公司 | Stored data inquiring method and device |
CN104408179A (en) * | 2014-12-15 | 2015-03-11 | 北京国双科技有限公司 | Method and device for processing data from data table |
CN106933914A (en) * | 2015-12-31 | 2017-07-07 | 北京国双科技有限公司 | The data processing method and device of many tables of data |
CN106933893A (en) * | 2015-12-31 | 2017-07-07 | 北京国双科技有限公司 | The querying method and device of multi-dimensional data |
Also Published As
Publication number | Publication date |
---|---|
CN106933902A (en) | 2017-07-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11977541B2 (en) | Systems and methods for rapid data analysis | |
CN106933906B (en) | Data multi-dimensional query method and device | |
US9183529B2 (en) | Business intelligence performance analysis system | |
CN106933893B (en) | multi-dimensional data query method and device | |
US9448999B2 (en) | Method and device to detect similar documents | |
CN105653607B (en) | SQL log collection analysis method and device | |
US8676746B2 (en) | Database management system risk assessment | |
CN106933902B (en) | Data multidimensional free analysis query method and device | |
US20080222104A1 (en) | Clustered index with differentiated subfields | |
TW201214167A (en) | Matching text sets | |
CN102724059A (en) | Website operation state monitoring and abnormal detection based on MapReduce | |
US20180032603A1 (en) | Extracting graph topology from distributed databases | |
EP2843568A1 (en) | Computer implemented method for creating database structures without knowledge on functioning of relational database system | |
US20140188924A1 (en) | Techniques for ordering predicates in column partitioned databases for query optimization | |
Dividino et al. | Change-a-lod: does the schema on the linked data cloud change or not? | |
Liu et al. | Keyword search on temporal graphs | |
Saleem | Storage, indexing, query processing, and benchmarking in centralized and distributed RDF engines: a survey | |
CN106933909B (en) | Multi-dimensional data query method and device | |
MahmoudiNasab et al. | AdaptRDF: adaptive storage management for RDF databases | |
US20100268723A1 (en) | Method of partitioning a search query to gather results beyond a search limit | |
Quoc et al. | A performance study of RDF stores for linked sensor data | |
Dang-Ngoc et al. | Xlive: An xml light integration virtual engine | |
JP2014211790A (en) | Row-oriented type key value store transition design assisting system and transition design assisting method | |
Thanisch et al. | Using the entity-attribute-value model for OLAP cube construction | |
CN108304499B (en) | Method, terminal and medium for pushing down predicate in SQL connection operation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing. Applicant after: Beijing Guoshuang Technology Co.,Ltd. Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing. Applicant before: Beijing Guoshuang Technology Co.,Ltd. |
GR01 | Patent grant | |