Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
To address the low efficiency of existing methods for obtaining a data bypass, the embodiments of this specification provide a method, an apparatus and an electronic device for obtaining a data bypass, which can be applied to equipment that constructs data bypasses.
It should be noted that the method for obtaining data bypass provided by the embodiments of the present specification may be used to create data bypass required by data warehouse analysis, and may also be used to create data bypass in other application scenarios.
A method for obtaining data bypass proposed in the embodiments of the present specification will be described below.
As shown in fig. 1, a method for obtaining data bypass provided by an embodiment of the present specification may include:
Step 102: generating soft links for the scheduling file and the plurality of task files corresponding to a plurality of task nodes, respectively, to obtain a plurality of soft links, wherein the plurality of task nodes are the task nodes contained in a target data link for which a data bypass is to be created.
The target data link may be a data calculation process for a data warehouse.
The target data link for which the data bypass is to be created generally includes a plurality of task nodes: a start task node, an end task node, and several other task nodes located between them. The task nodes depend on one another (this dependency is also called a lineage relationship). Generally, the plurality of task nodes are scheduled to run by at least one scheduling file (Task Rel File), and each task node has at least one corresponding task file (Task File); when the at least one task file is called, the task of that task node is carried out.
Optionally, before step 102, the plurality of task nodes included in the target data link and the dependency relationship between the plurality of task nodes may also be determined; and then determining a scheduling file and a plurality of task files corresponding to the plurality of task nodes according to the dependency relationship.
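As a minimal sketch of this optional preparation, the dependency relationship can be held as a map from each task node to the nodes it depends on, and the task files can then be listed in an order that respects that map. The node names, file names and the function `ordered_task_files` are hypothetical and are not taken from the embodiments; the ordering uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each task node -> the nodes it depends on.
DEPENDENCIES = {
    "node1": set(),               # start task node
    "node2": {"node1"},
    "node3": {"node1"},
    "node4": {"node2", "node3"},  # end task node
}

# Hypothetical lookup from each task node to its task file(s).
TASK_FILES = {
    "node1": ["task_file1.sql"],
    "node2": ["task_file2.sql"],
    "node3": ["task_file3.sql"],
    "node4": ["task_file4.sql"],
}


def ordered_task_files(dependencies, task_files):
    """Return task files in an order that respects the node dependencies."""
    order = TopologicalSorter(dependencies).static_order()
    return [name for node in order for name in task_files.get(node, [])]


files = ordered_task_files(DEPENDENCIES, TASK_FILES)
```

The start node's file comes first and the end node's file comes last; the relative order of nodes that do not depend on each other is left to the sorter.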
A soft link, also known as a symbolic link, is a special file that stores the path of another file; accessing the soft link accesses the file it points to.
Optionally, step 102 further comprises: saving the plurality of soft links to a first preset file.
Specifically, as shown in fig. 2, when soft links are respectively generated in step 102 for the scheduling file and the task files corresponding to the task nodes, the files 21 corresponding to the task nodes [including Task File 1 (Task File1), Task File 2 (Task File2), Task File 3 (Task File3), Task File 4 (Task File4), …, Task File N (Task FileN) and the scheduling file (Task Rel File)] may be input into the soft link generator 22 to obtain a plurality of soft links [including the soft link to Task File 1 (Soft link to Task File1), the soft link to Task File 2 (Soft link to Task File2), the soft link to Task File 3 (Soft link to Task File3), the soft link to Task File 4 (Soft link to Task File4), …, the soft link to Task File N (Soft link to Task FileN), and the soft link to the scheduling file (Soft link to Task Rel File)], and the plurality of soft links may be stored in the first preset file 23. The first preset file may be a file in XML format, and the file name of the first preset file may be Soft link.
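The soft link generator of fig. 2 might be sketched as follows, assuming a POSIX-style file system on which symbolic links can be created. The function name `generate_soft_links`, the `link_to_` prefix and the XML element names are hypothetical; the embodiments only state that the links are saved to a preset file that may be in XML format.

```python
import os
import xml.etree.ElementTree as ET
from pathlib import Path


def generate_soft_links(source_files, link_dir, manifest_path):
    """Create one symbolic link per source file and record the links in an XML file."""
    link_dir = Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    root = ET.Element("soft_links")
    links = []
    for source in map(Path, source_files):
        target = source.resolve()
        link = link_dir / f"link_to_{source.name}"
        if link.is_symlink():
            link.unlink()  # refresh a stale link left by an earlier run
        os.symlink(target, link)
        # Record where the link lives and which file it points to.
        ET.SubElement(root, "link", path=str(link), target=str(target))
        links.append(link)
    ET.ElementTree(root).write(str(manifest_path), encoding="utf-8", xml_declaration=True)
    return links
```

Calling `generate_soft_links` with the scheduling file and the task files yields one link per file plus a manifest that later steps can read instead of searching for the files again.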
Step 104: calling the plurality of soft links to read the plurality of files pointed to by the soft links.
When the plurality of soft links are saved in the first preset file, step 104 may include: calling the soft links from the first preset file to read the files pointed to by the soft links.
Since the soft link is a path for accessing the file pointed to by the soft link, the files pointed to by the soft links can be read by accessing the soft links.
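A minimal sketch of step 104, assuming the first preset file is an XML document whose `link` elements carry a `path` attribute (a hypothetical layout chosen for illustration). Opening a symbolic link opens the file it points to, so no separate lookup of the original file is needed.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def read_linked_files(manifest_path):
    """Return {resolved target path: file text} for every link listed in the manifest."""
    contents = {}
    for entry in ET.parse(str(manifest_path)).getroot().iter("link"):
        link = Path(entry.get("path"))
        # Reading through the soft link transparently reads the target file.
        contents[str(link.resolve())] = link.read_text(encoding="utf-8")
    return contents
```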
Step 106: analyzing the code of the plurality of files to acquire a plurality of data table identifiers corresponding to the plurality of task nodes, and correspondingly generating a plurality of bypass table identifiers according to the plurality of data table identifiers.
First, in step 106, according to the dependency relationship among the task nodes, the codes of the files may be sequentially analyzed from the file pointed by the start task node to the file pointed by the end task node, so as to obtain a plurality of data table identifiers corresponding to the task nodes.
For example, as shown in fig. 3, if the tasks are executed in numerical order from the start task node to the end task node, the soft link of the scheduling file is called so as to sequentially call the soft link of task file 1, the soft link of task file 2, …, and the soft link of task file N, so that task file 1, task file 2, …, and task file N are run. The code of task file 1, task file 2, …, and task file N is analyzed to obtain the data tables [including input data tables and output data tables] corresponding to each task node, yielding the data table list 24, which includes table 1 (table1), table 2 (table2), table 3 (table3), …, and table N (tableN).
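The embodiments do not say how the code is analyzed. One simple possibility, assuming the task files contain SQL, is to scan for the identifiers that follow table-introducing keywords. The pattern and the function `extract_table_ids` below are illustrative assumptions; a production implementation would more likely use a real SQL parser.

```python
import re

# Assumption: task files are SQL, and table names follow these keywords.
TABLE_PATTERN = re.compile(
    r"\b(?:from|join|into(?:\s+table)?|overwrite\s+table)\s+([A-Za-z_][\w.]*)",
    re.IGNORECASE,
)


def extract_table_ids(code):
    """Return table identifiers in first-seen order, without duplicates."""
    seen = {}
    for match in TABLE_PATTERN.finditer(code):
        seen.setdefault(match.group(1), None)  # dict keeps insertion order
    return list(seen)


sample = (
    "insert overwrite table table3 "
    "select * from table1 a join table2 b on a.id = b.id"
)
tables = extract_table_ids(sample)
```

In this sample the output table `table3` is found first, followed by the input tables `table1` and `table2`. Subqueries in parentheses are skipped because the captured name must start with a letter or underscore.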
Next, in step 106, the correspondingly generating a plurality of bypass table identifiers according to the plurality of data table identifiers may include the following sub-steps:
Substep 1: correspondingly generating a plurality of bypass table identifiers according to a first preset rule and the plurality of data table identifiers.
The data table identifier may be a data table name or a data table number, and correspondingly, the bypass table identifier may be a bypass table name or a bypass table number, and the generated bypass table identifier is different from the data table identifier corresponding thereto.
As an example, a plurality of bypass table names may be generated from the plurality of data table names according to a first preset naming rule. As shown in fig. 3, if the names of the data tables in the data table list 24 are table 1 (table1), table 2 (table2), table 3 (table3), …, and table N (tableN), the following bypass table names can be generated respectively: copy table 1 (table1_copy), copy table 2 (table2_copy), copy table 3 (table3_copy), …, and copy table N (tableN_copy). That is, the first preset naming rule is to append "copy" to the data table name. It is understood that the first preset naming rule can vary and is not limited to the example shown.
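The example naming rule can be sketched in a few lines. The function name `make_bypass_ids` and the collision check are assumptions added for illustration; the check reflects the requirement that a bypass table identifier differ from the data table identifiers.

```python
def make_bypass_ids(table_ids, suffix="_copy"):
    """Map each data table identifier to a bypass identifier by appending a suffix."""
    existing = set(table_ids)
    mapping = {}
    for table_id in table_ids:
        bypass_id = f"{table_id}{suffix}"
        if bypass_id in existing:
            # A bypass name must not collide with a real data table name.
            raise ValueError(f"bypass name {bypass_id!r} clashes with a data table")
        mapping[table_id] = bypass_id
    return mapping


mapping = make_bypass_ids(["table1", "table2", "table3"])
```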
Substep 2: storing the mapping relationship between the plurality of data table identifiers and the plurality of bypass table identifiers.
As shown in fig. 3, the mapping relationship between the plurality of data table names and the plurality of bypass table names may be saved in the second preset file 25. The second preset file 25 may also be a file in xml format.
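A minimal sketch of saving and reloading the mapping relationship as XML. The element and attribute names (`table_mapping`, `entry`, `data_table`, `bypass_table`) and both function names are hypothetical; the embodiments only state that the second preset file may be in XML format.

```python
import xml.etree.ElementTree as ET


def save_mapping(mapping, path):
    """Write {data table id: bypass table id} pairs to an XML file."""
    root = ET.Element("table_mapping")
    for data_table, bypass_table in mapping.items():
        ET.SubElement(root, "entry", data_table=data_table, bypass_table=bypass_table)
    ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)


def load_mapping(path):
    """Read the pairs back into a dictionary, preserving document order."""
    root = ET.parse(str(path)).getroot()
    return {e.get("data_table"): e.get("bypass_table") for e in root.iter("entry")}
```

Keeping the mapping in one file lets both the table-copy step and the identifier-replacement step work from the same source.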
Step 108: copying the plurality of data tables corresponding to the plurality of data table identifiers to obtain a plurality of bypass tables, and correspondingly identifying the plurality of bypass tables by using the plurality of bypass table identifiers.
Specifically, step 108 may include the following sub-steps:
Substep 1: generating copy statements for the plurality of data tables in batch according to the mapping relationship.
For example, as shown in fig. 3, copy statements for the plurality of data tables are generated in batch according to the mapping relationship saved in the second preset file 25; specifically, LIKE statements for copying the plurality of data tables may be generated in batch.
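Batch generation of the copy statements might look like the sketch below. It assumes a dialect that supports `CREATE TABLE ... LIKE` (MySQL and Hive do); in those dialects the statement copies only the table definition, so copying rows would need a further `INSERT ... SELECT`. The function name and the identifier check are illustrative assumptions.

```python
import re

# Only plain (optionally schema-qualified) identifiers are accepted, so that
# a malformed mapping entry cannot inject extra SQL into the statement.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


def build_copy_statements(mapping):
    """Return one CREATE TABLE ... LIKE statement per mapping entry."""
    statements = []
    for data_table, bypass_table in mapping.items():
        for name in (data_table, bypass_table):
            if not _IDENTIFIER.fullmatch(name):
                raise ValueError(f"unsafe table identifier: {name!r}")
        statements.append(f"CREATE TABLE {bypass_table} LIKE {data_table};")
    return statements


statements = build_copy_statements({"table1": "table1_copy", "table2": "table2_copy"})
```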
Substep 2: executing the batch-generated copy statements to copy the plurality of data tables corresponding to the plurality of data table identifiers, obtaining a plurality of bypass tables, and correspondingly identifying the plurality of bypass tables by using the plurality of bypass table identifiers.
Specifically, the data tables corresponding to the data table identifiers may be read from a specified storage location (e.g., a database) according to the data table identifiers. Correspondingly, the copied bypass tables can be stored in a specified storage location (e.g., a database), so that when the data bypass is run, the corresponding bypass table can be read from that storage location according to the bypass table identifier in the data bypass file.
For example, as shown in fig. 3, the batch-generated copy statements are executed, the plurality of data tables corresponding to the plurality of data table identifiers are copied to obtain a plurality of bypass tables 26, and each bypass table is identified by the corresponding bypass table name generated as described above.
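To make the copy step concrete, the sketch below uses SQLite from the Python standard library as a stand-in for the specified storage location. SQLite has no `CREATE TABLE ... LIKE`, so the sketch swaps in `CREATE TABLE ... AS SELECT`, which copies the rows; this substitution and the function name `copy_tables` are assumptions, not the statements used in the embodiments.

```python
import sqlite3


def copy_tables(conn, mapping):
    """Copy every data table into its bypass table inside one database."""
    for data_table, bypass_table in mapping.items():
        # Assumption: identifiers come from a trusted mapping file.
        conn.execute(f'DROP TABLE IF EXISTS "{bypass_table}"')
        conn.execute(f'CREATE TABLE "{bypass_table}" AS SELECT * FROM "{data_table}"')
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "a"), (2, "b")])
copy_tables(conn, {"table1": "table1_copy"})
copied_rows = conn.execute("SELECT COUNT(*) FROM table1_copy").fetchone()[0]
```

Dropping an existing bypass table first makes the copy repeatable, which matters if the data bypass is rebuilt on each call.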
Step 110: copying the plurality of files to obtain a plurality of bypass files.
As an example, step 110 may specifically include the following sub-steps:
Substep 1: determining a plurality of bypass file names corresponding to the plurality of files according to a second preset rule.
The second preset rule may be a second preset naming rule.
For example, if the names of the files are Task File 1 (Task File1), Task File 2 (Task File2), Task File 3 (Task File3), Task File 4 (Task File4), …, and Task File N (Task FileN), then the bypass file names may be, in sequence: bypass Task File 1 (Task File1copy), bypass Task File 2 (Task File2copy), bypass Task File 3 (Task File3copy), bypass Task File 4 (Task File4copy), …, and bypass Task File N (Task FileNcopy). That is, "copy" is appended to the task file name to form the name of the corresponding bypass file.
Substep 2: copying the plurality of files to obtain a plurality of bypass files.
In a specific implementation, as shown in fig. 4, by calling a plurality of soft links in the first preset file 23, the contents of the files pointed by the plurality of soft links are obtained and copied, and then stored in the preset bypass file directory 27.
Substep 3: correspondingly naming the plurality of bypass files by using the plurality of bypass file names.
As shown in fig. 4, the plurality of bypass files are named according to the bypass file names generated in sub-step 1.
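The three substeps can be sketched together as follows. The function names are hypothetical, and placing "copy" before the file extension (rather than after it) is an assumption made so that the copied file keeps its type.

```python
import shutil
from pathlib import Path


def bypass_name(file_name, suffix="copy"):
    """Append the suffix to the base name: task_file1.sql -> task_file1copy.sql."""
    path = Path(file_name)
    return f"{path.stem}{suffix}{path.suffix}"


def copy_to_bypass_dir(links, bypass_dir):
    """Copy the file behind each soft link into the bypass directory."""
    bypass_dir = Path(bypass_dir)
    bypass_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for link in map(Path, links):
        target = link.resolve()  # the real file behind the soft link
        destination = bypass_dir / bypass_name(target.name)
        shutil.copyfile(target, destination)  # copies content, not the link
        copies.append(destination)
    return copies
```

Because each copy is an independent regular file, later edits to the bypass files leave the files of the target data link untouched.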
Step 112: replacing the data table identifiers in the plurality of bypass files with the corresponding bypass table identifiers to obtain the data bypass of the target data link.
Because the data table identifiers in the bypass files obtained by copying the plurality of files are still the data table identifiers of the target data link, the creation of the data bypass is not yet complete. The data table identifiers in the plurality of bypass files therefore need to be replaced with the corresponding bypass table identifiers generated in step 106, so that calling the created data bypass does not affect the normal operation of the target data link.
As a specific example, the data table identifiers in the multiple bypass files may be replaced with corresponding bypass table identifiers according to the mapping relationship stored in the second preset file, so as to finally obtain the data bypass of the target data link.
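A minimal sketch of the replacement, assuming the mapping is available as a dictionary. It replaces whole identifiers only and does so in a single pass, so `table1` is not rewritten inside `table10`, and an identifier that has already been replaced is not processed again. The function name is hypothetical, and occurrences inside comments or string literals would also be replaced.

```python
import re


def replace_table_ids(code, mapping):
    """Replace each data table identifier in code with its bypass identifier."""
    if not mapping:
        return code
    # Longest names first, so that table10 is tried before table1.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(name) for name in names) + r")(?!\w)"
    )
    return pattern.sub(lambda match: mapping[match.group(1)], code)


original = "select * from table1 join table10 on table1.id = table10.id"
rewritten = replace_table_ids(
    original, {"table1": "table1_copy", "table10": "table10_copy"}
)
```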
In the method for obtaining a data bypass provided in the embodiments of this specification, every stage is performed automatically: soft links are generated for the scheduling file and the task files corresponding to the plurality of task nodes included in the target data link; the files pointed to by the soft links are obtained by calling the soft links; the code of those files is analyzed to obtain the plurality of data table identifiers for which bypass tables need to be created; the corresponding bypass table identifiers are generated; the plurality of data tables are copied to obtain bypass tables identified by the bypass table identifiers; the files pointed to by the soft links are copied to obtain bypass files; and finally the data table identifiers in the bypass files are replaced with the corresponding bypass table identifiers, yielding the data bypass of the target data link. Because the whole process requires no manual operation, the efficiency of creating the data bypass can be improved.
Optionally, a method for obtaining data bypass provided by an embodiment of this specification may further include:
calling a preset bypass modifier to modify the service logic on a target task node and/or the data in a target bypass table, wherein the target task node is at least one task node in the plurality of task nodes, and the target bypass table is at least one bypass table in the plurality of bypass tables.
The purpose of this step is to automatically modify the data bypass to ensure the accuracy of the data bypass.
Optionally, as shown in fig. 5, a method for obtaining data bypass provided in an embodiment of the present specification may further include:
Step 114: receiving a call request for calling the data bypass, and then proceeding to steps 102 to 112.
In practice, the data bypass is invoked by a data bypass invoker. That is, the call request is sent by the data bypass invoker.
Step 115: after steps 102 to 112 have been executed to update the data bypass of the target data link, responding to the call request.
The method for obtaining the data bypass provided by the embodiments of the present specification can automatically update the data bypass of the target data link according to the latest target data link (because the target data link may change) when the data bypass of the target data link is called each time, so as to obtain a bypass test result more conforming to the current actual situation.
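The update-on-call behaviour can be sketched as a thin wrapper that rebuilds the data bypass before answering each call request. The class `BypassService` and its two callables are hypothetical: `rebuild_bypass` stands in for steps 102 to 112 and `run_bypass` for whatever the invoker wants done with the refreshed bypass.

```python
class BypassService:
    """Rebuilds the data bypass on every call request, then answers the request."""

    def __init__(self, rebuild_bypass, run_bypass):
        self._rebuild_bypass = rebuild_bypass  # stands in for steps 102 to 112
        self._run_bypass = run_bypass          # uses the refreshed bypass

    def handle_call(self, request):
        bypass = self._rebuild_bypass()            # update the bypass first
        return self._run_bypass(bypass, request)   # then respond to the request
```

Rebuilding on every call trades some latency for freshness; an implementation could instead rebuild only when the target data link has changed, which the embodiments do not address.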
The above is a description of a method for obtaining data bypass applied to a server in the present specification, and the electronic device provided in the present specification is described below.
Fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present specification. Referring to fig. 6, at the hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The memory may include an internal memory, such as a random-access memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in fig. 6, but this does not mean that there is only one bus or one type of bus.
The memory is used for storing programs. In particular, a program may include program code comprising computer operating instructions. The memory may include both internal memory and non-volatile storage, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming an apparatus for obtaining the data bypass at the logical level. The processor executes the program stored in the memory, and is specifically configured to perform the following operations:
respectively generating soft links aiming at a scheduling file corresponding to a plurality of task nodes and a plurality of task files to obtain a plurality of soft links, wherein the plurality of task nodes are task nodes contained in a target data link of a data bypass to be created;
calling the soft links to read files pointed by the soft links;
analyzing the codes of the files, acquiring a plurality of data table identifications corresponding to the task nodes, and correspondingly generating a plurality of bypass table identifications according to the data table identifications;
copying a plurality of data tables corresponding to the plurality of data table identifications to obtain a plurality of bypass tables, and correspondingly identifying the plurality of bypass tables by using the plurality of bypass table identifications;
copying the plurality of files to obtain a plurality of bypass files;
and replacing the data table identifications in the bypass files with corresponding bypass table identifications to obtain the data bypass of the target data link.
The method for obtaining data bypass disclosed in the embodiment of fig. 1 or fig. 5 of the present specification may be applied to a processor, or may be implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the methods, steps and logic blocks disclosed in one or more embodiments of the present specification. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with one or more embodiments of the present specification may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the method in combination with its hardware.
The electronic device may further perform the method for obtaining data bypass of fig. 1 or fig. 5, which is not described herein again.
Of course, in addition to a software implementation, the electronic device in this specification does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the processing flow is not limited to logic units, and may also be hardware or logic devices.
An embodiment of this specification also provides a computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a portable electronic device comprising a plurality of application programs, cause the portable electronic device to perform the method of the embodiment shown in fig. 1, and in particular to perform the following operations:
respectively generating soft links aiming at a scheduling file corresponding to a plurality of task nodes and a plurality of task files to obtain a plurality of soft links, wherein the plurality of task nodes are task nodes contained in a target data link of a data bypass to be created;
calling the soft links to read files pointed by the soft links;
analyzing the codes of the files, acquiring a plurality of data table identifications corresponding to the task nodes, and correspondingly generating a plurality of bypass table identifications according to the data table identifications;
copying a plurality of data tables corresponding to the plurality of data table identifications to obtain a plurality of bypass tables, and correspondingly identifying the plurality of bypass tables by using the plurality of bypass table identifications;
copying the plurality of files to obtain a plurality of bypass files;
and replacing the data table identifications in the bypass files with corresponding bypass table identifications to obtain the data bypass of the target data link.
The following is a description of the apparatus for obtaining data bypass provided in this specification.
As shown in fig. 7, an embodiment of the present specification provides an apparatus 700 for obtaining data bypass, and in a software implementation, the apparatus 700 for obtaining data bypass may include a soft link generation module 701, a soft link calling module 702, an identifier generation module 703, a bypass table obtaining module 704, a bypass file obtaining module 705, and a table replacing module 706.
A soft link generating module 701, configured to generate soft links for a scheduling file corresponding to a plurality of task nodes and the plurality of task files respectively to obtain the plurality of soft links, where the plurality of task nodes are task nodes included in a target data link of a data bypass to be created.
Optionally, the apparatus 700 for obtaining data bypass may further include: the device comprises a first determination module and a second determination module.
A first determining module, configured to determine, before soft links are respectively generated for the scheduling file and the task files corresponding to the plurality of task nodes, the plurality of task nodes included in the target data link and the dependency relationship between the plurality of task nodes.
And the second determining module is used for determining the scheduling files and the plurality of task files corresponding to the plurality of task nodes according to the dependency relationship.
Optionally, the soft link generating module 701 is further configured to store the plurality of soft links in a first preset file.
A soft link calling module 702, configured to call the soft links to read the files pointed to by the soft links.
Optionally, the soft link invoking module 702 may be specifically configured to invoke the soft links from the first preset file to read files pointed to by the soft links.
The identifier generating module 703 is configured to analyze the codes of the multiple files, obtain multiple data table identifiers corresponding to the multiple task nodes, and generate multiple bypass table identifiers correspondingly according to the multiple data table identifiers.
The plurality of task nodes at least comprise a starting task node and an ending task node.
Correspondingly, the identifier generating module 703 is specifically configured to, according to the dependency relationship among the task nodes, sequentially analyze codes of the files from the file indicated by the start task node to the file indicated by the end task node, and obtain the identifiers of the data tables corresponding to the task nodes.
More specifically, the identifier generating module 703 is configured to generate a plurality of bypass table identifiers correspondingly according to a first preset rule and the plurality of data table identifiers; and storing the mapping relation between the plurality of data table identifications and the plurality of bypass table identifications.
A bypass table obtaining module 704, configured to copy the multiple data tables corresponding to the multiple data table identifiers to obtain multiple bypass tables, and identify the multiple bypass tables by using the multiple bypass table identifiers correspondingly.
Specifically, the bypass table obtaining module 704 is specifically configured to generate the copy statements of the multiple data tables in batch according to the mapping relationship; and executing the batch generated copy statement, copying the plurality of data tables corresponding to the plurality of data table identifications to obtain a plurality of bypass tables, and correspondingly identifying the plurality of bypass tables by using the plurality of bypass table identifications.
A bypass file obtaining module 705, configured to copy the multiple files to obtain multiple bypass files.
Specifically, the bypass file obtaining module 705 may be configured to:
determining a plurality of bypass file names corresponding to the plurality of files according to a second preset rule;
copying the plurality of files to obtain a plurality of bypass files;
and correspondingly naming the bypass files by using the bypass file names.
A table replacing module 706, configured to replace the data table identifier in the multiple bypass files with a corresponding bypass table identifier, so as to obtain a data bypass of the target data link.
Specifically, the table replacing module 706 may be configured to replace the data table identifiers in the multiple bypass files with corresponding bypass table identifiers according to the mapping relationship, so as to obtain the data bypass of the target data link.
The apparatus 700 for obtaining a data bypass according to the embodiment shown in fig. 7 performs every stage automatically: it generates soft links for the scheduling file and the task files corresponding to the plurality of task nodes included in the target data link; obtains the files pointed to by the soft links by calling the soft links; analyzes the code of those files to obtain the plurality of data table identifiers for which bypass tables need to be created; generates the corresponding bypass table identifiers; copies the plurality of data tables to obtain bypass tables identified by the bypass table identifiers; copies the files pointed to by the soft links to obtain bypass files; and finally replaces the data table identifiers in the bypass files with the corresponding bypass table identifiers, yielding the data bypass of the target data link. Because the whole process requires no manual operation, the efficiency of creating the data bypass can be improved.
It should be noted that, the apparatus 700 for obtaining data bypass can implement the method in the embodiment of the method in fig. 1, and specific reference may be made to the method for obtaining data bypass in the embodiment shown in fig. 1, which is not described again.
Optionally, the apparatus 700 for obtaining data bypass provided by an embodiment of the present specification may further include:
a bypass modification module, configured to, after the data table identifiers in the bypass files are replaced with the corresponding bypass table identifiers and before the data bypass of the target data link is obtained, invoke a preset bypass modifier to modify the service logic on a target task node and/or the data in a target bypass table, where the target task node is at least one task node in the plurality of task nodes, and the target bypass table is at least one of the plurality of bypass tables.
The purpose of introducing the bypass modification module is to automatically modify the data bypass to ensure the accuracy of the data bypass.
Optionally, as shown in fig. 8, an embodiment of the present specification provides an apparatus 700 for obtaining data bypass, which may further include: a receiving module 707 and a response module 708.
A receiving module 707, configured to receive a call request for calling the data bypass.
A response module 708, configured to respond to the call request after triggering the soft link generation module 701, the soft link calling module 702, the identifier generation module 703, the bypass table obtaining module 704, the bypass file obtaining module 705, and the table replacement module 706 to update the data bypass of the target data link.
The apparatus 700 for obtaining data bypass according to the embodiment shown in fig. 8 may update the data bypass of the target data link according to the latest target data link (because the target data link may change) automatically each time the data bypass of the target data link is invoked, so as to obtain a bypass test result that better meets the current actual situation.
It should be noted that the apparatus 700 for obtaining data bypass shown in fig. 8 can implement the method in the embodiment of the method shown in fig. 5, and specifically refer to the method for obtaining data bypass in the embodiment shown in fig. 5, which is not described again.
While certain embodiments of the present disclosure have been described above, other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
In short, the above description is only a preferred embodiment of the present disclosure, and is not intended to limit the scope of the present disclosure. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of one or more embodiments of the present disclosure should be included in the scope of protection of one or more embodiments of the present disclosure.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media, including permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.