CN110134702A - Data flow joining method, device, equipment and storage medium - Google Patents
- Publication number
- CN110134702A, CN201910412910.3A
- Authority
- CN
- China
- Prior art keywords
- media information
- data source
- data
- current media
- external storage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2379—Updates performed during online database operations; commit processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Debugging And Monitoring (AREA)
Abstract
Embodiments of the invention disclose a data stream joining method, apparatus, device, and storage medium. The method includes: obtaining at least three data source logs of current media information from at least three data sources; determining a row key of the current media information, where there is a unique mapping between media information and row keys; and, according to the row key of the current media information, writing the at least three data source logs of the current media information into the same row of external storage. Based on the multi-column property of tables in the storage, the embodiments write the multiple data streams of the same media information into one row. This effectively integrates the data streams of the same media information during joining, so that in the subsequent data analysis the multiple data source logs of the media information can be fetched quickly from external storage, which improves joining efficiency and reduces the resources that joining occupies.
Description
Technical field
Embodiments of the present invention relate to the technical field of data processing, and in particular to a data stream joining method, apparatus, device, and storage medium.
Background
With the rapid development of Internet technology, the ways in which media information is presented are becoming increasingly diverse. To mine the delivery effect of media information, its backend logs, exposure logs, click logs, and conversion logs need to be analyzed. Because media information can be delivered through multiple channels, the same type of log can come from multiple data sources. Therefore, when mining the delivery effect of media information, the data from its multiple data sources must be joined.
Currently, streaming joins are generally used: based on the arrival time of the data streams, a streaming engine writes the first data stream into external storage; when the second data stream arrives, it writes the second data stream into that external storage and into the next external storage, and so on, so that the two data streams are joined in real time in external storage. To support joining more than two data streams, the progress of the streams must be aligned, the state of the streams must be kept in memory, and multiple operators must be used to join the streams pairwise.
However, when joining multiple data streams, existing data stream joining methods have poor timeliness and consume considerable system resources, so they cannot support real-time joining of multiple data streams.
Summary of the invention
Embodiments of the invention provide a data stream joining method, apparatus, device, and storage medium that can effectively integrate the multiple data streams of the same media information during data stream joining.
In a first aspect, an embodiment of the invention provides a data stream joining method applied to an external storage device, the method comprising:
obtaining at least three data source logs of current media information from at least three data sources;
determining a row key of the current media information, wherein there is a unique mapping between media information and row keys;
writing, according to the row key of the current media information, the at least three data source logs of the current media information into the same row of external storage.
In a second aspect, an embodiment of the invention provides a data stream joining apparatus configured in an external storage device, the apparatus comprising:
a data stream obtaining module, configured to obtain at least three data source logs of current media information from at least three data sources;
a media row key determining module, configured to determine a row key of the current media information, wherein there is a unique mapping between media information and row keys;
a data stream writing module, configured to write, according to the row key of the current media information, the at least three data source logs of the current media information into the same row of external storage.
In a third aspect, an embodiment of the invention provides a device, comprising:
one or more processors;
a memory, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the data stream joining method of any embodiment of the invention.
In a fourth aspect, an embodiment of the invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the data stream joining method of any embodiment of the invention.
When joining multiple data streams, embodiments of the invention obtain at least three data source logs of the current media information and, based on the row key associated with the current media information, write the at least three data source logs of the same media information into the same row of external storage. Based on the multi-column property of tables in the storage, the embodiments write the multiple data streams of the same media information into one row. This effectively integrates the data streams of the same media information during joining, so that in the subsequent data analysis the multiple data source logs of the media information can be fetched quickly from external storage, which improves joining efficiency and reduces the resources that joining occupies.
Brief description of the drawings
Fig. 1 is a flowchart of a data stream joining method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a data stream joining method provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic diagram of multi-source data stream joining provided by Embodiment 2 of the present invention;
Fig. 4 is a schematic structural diagram of a data stream joining apparatus provided by Embodiment 3 of the present invention;
Fig. 5 is a schematic structural diagram of a device provided by Embodiment 4 of the present invention.
Detailed description
The embodiments of the present invention are described in further detail below with reference to the drawings and examples. It should be understood that the specific embodiments described here only explain the embodiments of the invention and do not limit the invention. It should further be noted that, for ease of description, the drawings show only the parts related to the embodiments of the invention rather than the entire structure.
It should also be noted that, for ease of description, the drawings show only the parts related to this application rather than the full content. Before the exemplary embodiments are discussed in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. The process may be terminated when its operations are completed, but it may also include additional steps not shown in the drawings. The process may correspond to a method, function, procedure, subroutine, subprogram, and so on.
Embodiment 1
Fig. 1 is a flowchart of a data stream joining method provided by Embodiment 1 of the present invention. This embodiment is applicable to joining multi-source data streams. The method can be executed by an external storage device, specifically by a data stream joining apparatus that can be implemented in software and/or hardware and is preferably configured in the external storage device. The method specifically includes the following:
S110, obtaining at least three data source logs of current media information from at least three data sources.
In specific embodiments of the invention, media information refers to information used for publicity or dissemination, and may include advertisements, film and television resources, and the like. A data source is a device or original medium that provides the current media information, and may include a server, a website platform, and the like. Accordingly, each data source can generate data source logs that record user behavior. For example, in advertising, the same advertisement can be placed on different websites for promotion. Each website, acting as a data source, generates web logs such as backend logs, exposure logs, click logs, and conversion logs. These logs respectively record the physical delivery of the advertisement, reflect its degree of exposure, record users' access, browsing, and click behavior, and record conversion behavior such as users becoming members or customers through the advertisement. The data source logs of media information under different data sources can therefore help webmasters, operators, promoters, and others obtain website traffic information in real time, and provide a data basis for website analysis in terms of traffic sources, website content, visitor characteristics, and other aspects. This helps increase website traffic, improve the user experience, turn more visitors into members or customers, and obtain the maximum return from a smaller investment in media information.
The prior art can join only two data streams in real time and cannot support real-time joining of more than two. To address this technical problem, this embodiment obtains at least three data source logs of the current media information to be analyzed from at least three data sources on which it is delivered. Each data source can generate at least one data source log. For example, assuming advertisement A is the current media information to be analyzed, at least three data sources on which advertisement A is delivered are determined, such as website A, website B, and website C, and the logs related to advertisement A are then obtained from the servers of these websites.
S120, determining the row key of the current media information.
In specific embodiments of the invention, the row key is the unique identifier of the media information and can be expressed using information such as key information in the media information and a timestamp. The unique mapping between media information and row keys makes it easy to look up the data quickly after the multi-source data streams have been integrated.
For example, a unified timestamp of the current media information is selected from the timestamps of the at least three data source logs; the unified timestamp may be the first timestamp among the at least three data source logs. The keyword of the current media information and the unified timestamp of the current media information are then combined to form the row key of the current media information.
S130, writing, according to the row key of the current media information, the at least three data source logs of the current media information into the same row of external storage.
In specific embodiments of the invention, the row associated with the current media information in external storage can be determined from the unique mapping between media information and row keys. Based on the multi-column property of tables in the storage, the column associated with each data source in that row can also be determined. If external storage has no column associated with a given data source, a column associated with that data source is configured in external storage, and the mapping between columns and data sources in external storage is updated. The at least three data source logs of the current media information are then written into at least three columns of the same row of external storage. Specifically, each data source log of the current media information can be written as the value into the column associated with its data source; alternatively, each data source log and its timestamp can be written together as the value into the column associated with the data source. This completes the integration of the multi-source data streams during joining. Then, in the data analysis stage of joining, in response to a join request for target media information sent by a joining module, the at least three data source logs are queried in the row associated with the target media information, and the queried data source logs are returned to the joining module so that it can join the multi-source data streams.
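The write step can be illustrated with an in-memory stand-in for a wide-column table. This is a sketch under stated assumptions: `WideTable`, `write_logs`, the site names, and the log strings are all hypothetical, and a real external store would expose its own client API.

```python
from collections import defaultdict


class WideTable:
    """In-memory stand-in for a wide-column external store (hypothetical)."""

    def __init__(self):
        self.rows = defaultdict(dict)  # row key -> {column name: value}

    def put(self, row_key, column, value):
        self.rows[row_key][column] = value

    def get_row(self, row_key):
        return dict(self.rows.get(row_key, {}))


def write_logs(table, row_key, logs_by_source):
    """Write one log per data source into separate columns of the same row."""
    if len(logs_by_source) < 3:
        raise ValueError("expected logs from at least three data sources")
    for source, log in logs_by_source.items():
        table.put(row_key, source, log)


table = WideTable()
write_logs(table, "1557900005_ad_A", {
    "site_a": "impression",
    "site_b": "click",
    "site_c": "conversion",
})
print(table.get_row("1557900005_ad_A"))
```

Because all three logs share one row key, a single row lookup later returns every source's log for that media information.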
In the technical solution of this embodiment, when multiple data streams are joined, at least three data source logs of the current media information are obtained and, based on the row key associated with the current media information, the at least three data source logs of the same media information are written into the same row of external storage. Based on the multi-column property of tables in the storage, the embodiment writes the multiple data streams of the same media information into one row. This effectively integrates the data streams of the same media information during joining, so that in the subsequent data analysis the multiple data source logs of the media information can be fetched quickly from external storage, which improves joining efficiency and reduces the resources that joining occupies.
Embodiment 2
On the basis of Embodiment 1 above, this embodiment provides a preferred implementation of the data stream joining method, in which the multi-source data streams integrated in external storage can be fetched in one pass and joined in one pass. Fig. 2 is a flowchart of a data stream joining method provided by Embodiment 2 of the present invention. As shown in Fig. 2, the method includes the following:
S210, obtaining at least three data source logs of current media information from at least three data sources.
In specific embodiments of the invention, the current media information is delivered on at least three data sources, and each data source can generate at least one data source log. When data stream joining is performed on the current media information in order to analyze its delivery effect, data source logs can be obtained from each of the at least three data sources of the current media information, yielding at least three data source logs.
S220, selecting a unified timestamp of the current media information from the timestamps of the at least three data source logs.
In specific embodiments of the invention, different data source logs are generated at their own pace, and the timestamp of a data source log indicates when it was generated. The at least three data source logs are parsed to determine the timestamp of each, and one of these timestamps can be selected as the unified timestamp of the current media information. For example, the first timestamp of the at least three data source logs is used as the unified timestamp of the current media information. The unified timestamp identifies the approximate point in time of the obtained data streams of the current media information.
S230, combining the keyword of the current media information and the unified timestamp of the current media information to form the row key of the current media information.
In specific embodiments of the invention, the keyword of the current media information can be a field that identifies, or uniquely identifies, the current media information, such as the advertisement name or advertisement version. The row key is the unique identifier of the current media information, and there is a unique mapping between media information and row keys. The keyword of the current media information and the unified timestamp of the current media information can be combined to form the row key of the current media information, which uniquely identifies the logs of the multiple data sources under the current media information.
S240, writing, according to the row key of the current media information and the unique mapping between columns and data sources in external storage, the at least three data source logs of the current media information into at least three columns of the same row of external storage.
In specific embodiments of the invention, the row associated with the current media information in external storage can be determined from the unique mapping between media information and row keys. Based on the multi-column property of tables in the storage, the column associated with each data source in that row can be determined. The at least three data source logs of the current media information are then written as values, that is, as the stored data content corresponding to the row key, into the at least three associated columns of that row of external storage.
Optionally, each data source log of the current media information is written, together with the timestamp of that data source log, as the value into the column associated with the data source.
In this embodiment, in addition to the unified timestamp in the row key, the timestamp of each data source log can also be written, together with the data source log itself, as the value into the associated column of external storage. The data source logs in different columns of the same row then each carry their own timestamp, which facilitates further querying and analysis of the data.
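One way to store a log together with its own timestamp in a single column value is to serialize both into one cell. The sketch below uses JSON purely for illustration; the source does not specify an encoding, and the field names `ts` and `log` are hypothetical.

```python
import json


def encode_cell(log_line, log_timestamp):
    """Pack a data source log and its own timestamp into one column value."""
    return json.dumps({"ts": log_timestamp, "log": log_line}, sort_keys=True)


def decode_cell(cell):
    """Recover (log_line, log_timestamp) from a stored column value."""
    data = json.loads(cell)
    return data["log"], data["ts"]


cell = encode_cell("click", 1557900012)
print(decode_cell(cell))  # ('click', 1557900012)
```

With this layout the row key carries the unified timestamp while each cell keeps the per-source timestamp, so later analysis can still order events across sources.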
Optionally, if external storage has no column associated with a given data source, a column associated with that data source is configured in external storage, and the mapping between columns and data sources in external storage is updated; the data source log is then written into the column configured in external storage.
In this embodiment, when the data streams of media information are joined for the first time, or when the media information is delivered on a new data source, external storage may not yet contain a column associated with that data source. Therefore, if external storage has no column associated with a given data source, a column associated with that data source is configured in external storage, and the mapping between columns and data sources in external storage is updated. The row associated with the current media information is then determined from the row key, and the data source log is written into the newly configured column of that row in external storage.
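The on-demand column configuration can be sketched as a small registry that maps data sources to columns and allocates a new column the first time a source is seen. The class name, the `cf` family name, and the `src<N>` column naming are hypothetical.

```python
class ColumnRegistry:
    """Keeps the data source -> column mapping (hypothetical helper)."""

    def __init__(self, family="cf"):
        self.family = family
        self.column_of = {}  # data source -> column name

    def column_for(self, source):
        """Return the source's column, configuring a new one if none exists."""
        if source not in self.column_of:
            # New data source: allocate a column and update the mapping.
            self.column_of[source] = f"{self.family}:src{len(self.column_of)}"
        return self.column_of[source]


registry = ColumnRegistry()
first = registry.column_for("site_a")
second = registry.column_for("site_b")
again = registry.column_for("site_a")
print(first, second, again)  # cf:src0 cf:src1 cf:src0
```

A repeated lookup returns the same column, so the mapping between columns and data sources stays unique.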
For example, based on the column property of tables in the storage, the result of integrating multi-source data streams during joining is shown in Table 1. The RowKey column holds the row key, which is composed of a timestamp and a log key (Timestamp_logkey). ColumnFamily denotes a column family, and each Column in a column family identifies one data source. For example, the data streams of different websites of the same company correspond to Columns under one ColumnFamily. As Table 1 shows, during data stream joining the data source logs generated by the different data sources of the same media information are stored, based on the row key, as values in different columns of the same row, so that the data source logs of all data sources of that media information can be retrieved with a single row key lookup.
Table 1: Example of multi-source data stream integration during data stream joining
S250, in response to a join request for target media information sent by the joining module, querying the at least three data source logs in the row associated with the target media information.
In specific embodiments of the invention, the joining module is the module that performs data analysis during joining, and it can be deployed on a computer device different from the external storage. The joining module sends a join request for target media information to the external storage in which the data has been integrated, in order to obtain the integrated data to be analyzed. The join request may contain the target media information or row key information. Accordingly, in response to the join request sent by the joining module, the external storage device uses the row key as the index and compares the join request against the row keys to locate the row of the target media information in the external storage table, and then fetches the at least three data source logs of the target media information from that row in one pass.
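The lookup on the storage side can be sketched as below. The request fields `row_key` and `keyword`, and the assumption that row keys have the form `<timestamp>_<keyword>`, are hypothetical; the table is modeled as a plain dictionary from row key to columns.

```python
def find_rows(rows, request):
    """Resolve a join request to the matching rows of the table.

    `rows` maps row key -> {column: value}. The request carries either an
    exact "row_key" or a media "keyword" (both field names are hypothetical).
    """
    if "row_key" in request:
        row = rows.get(request["row_key"])
        return {request["row_key"]: dict(row)} if row is not None else {}
    # Assumed key layout '<timestamp>_<keyword>': compare the keyword part.
    keyword = request["keyword"]
    return {key: dict(row) for key, row in rows.items()
            if key.partition("_")[2] == keyword}


rows = {
    "1557900005_ad_A": {"site_a": "impression", "site_b": "click", "site_c": "conversion"},
    "1557900900_ad_B": {"site_a": "impression"},
}
print(find_rows(rows, {"row_key": "1557900005_ad_A"}))
print(find_rows(rows, {"keyword": "ad_B"}))
```

Either form of request returns whole rows, so all data source logs of the target media information come back in one response.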
S260, returning the queried at least three data source logs to the joining module, so that the joining module joins the multi-source data streams.
In specific embodiments of the invention, during data stream joining the external storage device returns all of the queried data source logs of the target media information to the joining module in one pass, so that the joining module can join all of the obtained data streams in one pass.
Specifically, Fig. 3 is a schematic diagram of multi-source data stream joining. As shown in Fig. 3, in the data integration stage of data stream joining, the data stream logs of the multiple data sources of the same media information are stored, based on the column property of tables in the storage, in different columns of the same row of external storage, so that the external storage integrates the multi-source data streams. In the data analysis stage of data stream joining, the external storage receives the join request sent by the joining module and, based on the timestamp in the row key and according to the progress of the data streams, can quickly look up the multi-source data streams of the target media information. The joining module can thus obtain the multiple data source logs of the target media information in one pass and, according to the progress of each data stream as detected by the watermark manager and the join conditions preconfigured for the business requirements, join the multiple data source logs in one pass. Day-level join windows can also be supported.
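A one-pass join gated by stream progress might look like the following sketch. The source does not define the watermark manager's interface or the join conditions, so everything here is assumed: watermarks are modeled as a per-source "processed up to" timestamp, and the join condition is simply that every required source is present in the row.

```python
def all_streams_caught_up(watermarks, sources, unified_ts):
    """True when every required stream has progressed to the row's timestamp."""
    return all(watermarks.get(src, float("-inf")) >= unified_ts for src in sources)


def join_row_once(row, sources, watermarks, unified_ts):
    """Join all source logs of one row in a single pass, or return None."""
    if not all_streams_caught_up(watermarks, sources, unified_ts):
        return None  # some stream may still deliver data for this row
    if any(src not in row for src in sources):
        return None  # assumed join condition: every source must be present
    return {"ts": unified_ts, **{src: row[src] for src in sources}}


sources = ["site_a", "site_b", "site_c"]
row = {"site_a": "impression", "site_b": "click", "site_c": "conversion"}
late = join_row_once(row, sources, {"site_a": 100, "site_b": 90, "site_c": 120}, 95)
ready = join_row_once(row, sources, {"site_a": 100, "site_b": 100, "site_c": 120}, 95)
print(late, ready)
```

Here the join happens once per row instead of pairwise per stream, which is the reduction in join operations that the embodiment describes.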
In the technical solution of this embodiment, when multiple data streams are joined, at least three data source logs of the current media information are obtained and, based on the row key associated with the current media information, the at least three data source logs of the same media information are written into the same row of external storage. During data stream joining, the multiple data source logs of the target media information are queried and provided to the joining module in one pass, so that the joining module can join them in one pass. Based on the multi-column property of tables in the storage, the embodiment writes the multiple data streams of the same media information into one row. This effectively integrates the data streams of the same media information during joining, so that in the subsequent data analysis the multiple data source logs of the media information can be fetched quickly from external storage. This reduces the number of join operations on the data streams, which improves joining efficiency and reduces the resources that joining occupies.
Embodiment 3
Fig. 4 is a schematic structural diagram of a data stream joining apparatus provided by Embodiment 3 of the present invention. This embodiment is applicable to joining multi-source data streams, and the apparatus can implement the data stream joining method of any embodiment of the invention. The apparatus specifically includes:
a data stream obtaining module 410, configured to obtain at least three data source logs of current media information from at least three data sources;
a media row key determining module 420, configured to determine the row key of the current media information, wherein there is a unique mapping between media information and row keys;
a data stream writing module 430, configured to write, according to the row key of the current media information, the at least three data source logs of the current media information into the same row of external storage.
Optionally, the media row key determining module 420 is specifically configured to:
select a unified timestamp of the current media information from the timestamps of the at least three data source logs;
combine the keyword of the current media information and the unified timestamp of the current media information to form the row key of the current media information.
Optionally, the media row key determining module 420 is specifically configured to:
use the first timestamp of the at least three data source logs as the unified timestamp of the current media information.
Optionally, the data stream writing module 430 is specifically configured to:
write, according to the unique mapping between columns and data sources in the external storage, the at least three data source logs of the current media information into at least three columns of the same row of the external storage.
Optionally, the data stream writing module 430 is specifically configured to:
if the external storage has no column associated with a given data source, configure a column associated with that data source in the external storage, and update the mapping between columns and data sources in the external storage;
write the data source log into the column configured in the external storage.
Optionally, the data stream writing module 430 is specifically configured to:
write each data source log of the current media information, together with the timestamp of that data source log, as the value into the column associated with the data source.
Further, the apparatus also includes a data stream query module 440, which is specifically configured to:
after the at least three data source logs of the current media information have been written into the same row of external storage, in response to a join request for target media information sent by the joining module, query the at least three data source logs in the row associated with the target media information;
return the queried at least three data source logs to the joining module, so that the joining module joins the multi-source data streams.
In the technical solution of this embodiment, the functional modules cooperate to implement functions such as obtaining the data streams, determining the row key, writing the values, and querying the data streams. Based on the multi-column property of tables in the storage, the embodiment writes the multiple data streams of the same media information into one row. This effectively integrates the data streams of the same media information during joining, so that in the subsequent data analysis the multiple data source logs of the media information can be fetched quickly from external storage. This reduces the number of join operations on the data streams, which improves joining efficiency and reduces the resources that joining occupies.
Embodiment 4
Fig. 5 is a schematic structural diagram of a device provided by Embodiment 4 of the present invention. Fig. 5 shows a block diagram of an exemplary device suitable for implementing embodiments of the invention. The device shown in Fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the invention.
The device 12 is preferably an external storage device.
As shown in Fig. 5, device 12 takes the form of a general-purpose computing device. The components of device 12 may include, but are not limited to: one or more processors 16, a system memory 28, and a bus 18 that connects the different system components (including system memory 28 and processors 16).
Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Device 12 typically includes a variety of computer-system-readable media. These media can be any available media that device 12 can access, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. The device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 34 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 5, commonly called a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM, or other optical medium) may also be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The system memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the system memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.
The device 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the device 12, and/or with any device (such as a network card, a modem, etc.) that enables the device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. The device 12 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown, the network adapter 20 communicates with the other modules of the device 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the device 12, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processor 16 runs programs stored in the system memory 28 to execute various functional applications and data processing, for example to implement the data flow joining method provided by the embodiments of the present invention.
Embodiment five
Embodiment five of the present invention further provides a computer-readable storage medium storing a computer program (also referred to as computer-executable instructions) that, when executed by a processor, performs a data flow joining method, the method comprising:
Obtaining at least three data source logs of current media information from at least three data sources;
Determining the row key of the current media information, where there is a unique mapping between media information and row keys;
Writing, according to the row key of the current media information, the at least three data source logs of the current media information into the same row of an external storage.
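The three steps above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the in-memory dictionary `external_storage` stands in for the external storage, and the names `row_key_for`, `write_logs`, `m1`, and `source_a` to `source_c` are hypothetical and do not come from the patent.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the external storage:
# row key -> {data source name -> data source log}.
external_storage = defaultdict(dict)


def row_key_for(media_id: str) -> str:
    """Map media information to its row key (assumed one-to-one mapping)."""
    return f"media:{media_id}"


def write_logs(media_id: str, logs_by_source: dict) -> str:
    """Write the logs of one media item from all data sources into one row."""
    if len(logs_by_source) < 3:
        raise ValueError("expected logs from at least three data sources")
    key = row_key_for(media_id)
    for source, log in logs_by_source.items():
        # Every source lands in the same row, so a later join is one row read.
        external_storage[key][source] = log
    return key


key = write_logs(
    "m1",
    {"source_a": "log from A", "source_b": "log from B", "source_c": "log from C"},
)
print(key, dict(external_storage[key]))
```

Because all logs for one media item share a row key, a later splicing step can fetch them with a single lookup instead of scanning each data flow separately.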
The computer storage medium of the embodiments of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
Computer program code for carrying out the operations of the embodiments of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or device. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the embodiments of the present invention have been described in some detail through the above embodiments, they are not limited to those embodiments and may include other equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the scope of the appended claims.
Claims (16)
1. A data flow joining method, characterized in that it is applied to an external storage device, the method comprising:
Obtaining at least three data source logs of current media information from at least three data sources;
Determining the row key of the current media information, where there is a unique mapping between media information and row keys;
Writing, according to the row key of the current media information, the at least three data source logs of the current media information into the same row of the external storage.
2. The method according to claim 1, characterized in that determining the row key of the current media information comprises:
Selecting a unified timestamp for the current media information from the timestamps of the at least three data source logs;
Combining the keyword of the current media information with the unified timestamp of the current media information to form the row key of the current media information.
3. The method according to claim 2, characterized in that selecting the unified timestamp for the current media information from the timestamps of the at least three data source logs comprises:
Using the first timestamp of the at least three data source logs as the unified timestamp of the current media information.
4. The method according to claim 1, characterized in that writing the at least three data source logs of the current media information into the same row of the external storage comprises:
Writing, according to the unique mapping between columns in the external storage and data sources, the at least three data source logs of the current media information into at least three columns of one row of the external storage.
5. The method according to claim 4, characterized in that writing, according to the unique mapping between columns in the external storage and data sources, the at least three data source logs of the current media information into at least three columns of one row of the external storage comprises:
If no column associated with a given data source exists in the external storage, configuring a column associated with that data source in the external storage, so as to update the mapping between columns in the external storage and data sources;
Writing the data source log into the column configured in the external storage.
6. The method according to claim 4, characterized in that writing the at least three data source logs of the current media information into the same row of the external storage comprises:
Writing the timestamp of each data source log of the current media information and that data source log, as a key-value pair, into the column associated with that data source.
7. The method according to claim 1, characterized in that, after writing the at least three data source logs of the current media information into the same row of the external storage, the method further comprises:
In response to a splicing request for target media information sent by a splicing module, querying at least three data source logs in the row associated with the target media information;
Feeding back the queried at least three data source logs to the splicing module, for the splicing module to perform multi-source data flow splicing.
8. A data flow splicing apparatus, characterized in that it is configured in an external storage device, the apparatus comprising:
A data flow acquisition module, configured to obtain at least three data source logs of current media information from at least three data sources;
A media row key determination module, configured to determine the row key of the current media information, where there is a unique mapping between media information and row keys;
A data flow writing module, configured to write, according to the row key of the current media information, the at least three data source logs of the current media information into the same row of the external storage.
9. The apparatus according to claim 8, characterized in that the media row key determination module is specifically configured to:
Select a unified timestamp for the current media information from the timestamps of the at least three data source logs;
Combine the keyword of the current media information with the unified timestamp of the current media information to form the row key of the current media information.
10. The apparatus according to claim 9, characterized in that the media row key determination module is specifically configured to:
Use the first timestamp of the at least three data source logs as the unified timestamp of the current media information.
11. The apparatus according to claim 8, characterized in that the data flow writing module is specifically configured to:
Write, according to the unique mapping between columns in the external storage and data sources, the at least three data source logs of the current media information into at least three columns of one row of the external storage.
12. The apparatus according to claim 11, characterized in that the data flow writing module is specifically configured to:
If no column associated with a given data source exists in the external storage, configure a column associated with that data source in the external storage, so as to update the mapping between columns in the external storage and data sources;
Write the data source log into the column configured in the external storage.
13. The apparatus according to claim 11, characterized in that the data flow writing module is specifically configured to:
Write the timestamp of each data source log of the current media information and that data source log, as a key-value pair, into the column associated with that data source.
14. The apparatus according to claim 8, characterized in that the apparatus further comprises a data flow query module, the data flow query module being configured to:
After the at least three data source logs of the current media information have been written into the same row of the external storage, in response to a splicing request for target media information sent by a splicing module, query at least three data source logs in the row associated with the target media information;
Feed back the queried at least three data source logs to the splicing module, for the splicing module to perform multi-source data flow splicing.
15. A device, characterized by comprising:
One or more processors;
A memory for storing one or more programs;
When the one or more programs are executed by the one or more processors, the one or more processors implement the data flow joining method according to any one of claims 1-7.
16. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the data flow joining method according to any one of claims 1-7.
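As an informal illustration of the features recited in claims 2 to 7 (row key built from keyword and unified timestamp, one column per data source, timestamp-keyed values, and a row query for the splicing module), the following Python sketch may help. It is not the claimed implementation. The class `WideRowStore`, the `col_` column prefix, the underscore separator in the row key, integer timestamps, and the reading of "first timestamp" as the earliest timestamp are all assumptions made for this example.

```python
class WideRowStore:
    """Hypothetical in-memory stand-in for a wide-column external storage."""

    def __init__(self):
        self.columns = {}  # data source -> column name (unique mapping)
        self.rows = {}     # row key -> {column -> {timestamp -> log}}

    def column_for(self, source):
        # Configure a column the first time a data source is seen,
        # which updates the column/data-source mapping.
        if source not in self.columns:
            self.columns[source] = f"col_{source}"
        return self.columns[source]

    def write(self, keyword, logs):
        """Write (source, timestamp, log) tuples of one media item into one row."""
        # Assumption: the "first" timestamp is taken to be the earliest one.
        unified_ts = min(ts for _, ts, _ in logs)
        row_key = f"{keyword}_{unified_ts}"
        row = self.rows.setdefault(row_key, {})
        for source, ts, log in logs:
            # Store each log under its own timestamp as a key-value pair.
            row.setdefault(self.column_for(source), {})[ts] = log
        return row_key

    def query(self, row_key):
        """Return every column of the row, as a splicing module would request."""
        return self.rows.get(row_key, {})


store = WideRowStore()
key = store.write(
    "m1",
    [("request", 1003, "request log"),
     ("show", 1001, "show log"),
     ("click", 1007, "click log")],
)
row = store.query(key)
print(key, sorted(row))
```

A real deployment would replace the dictionaries with calls to the chosen external storage; the sketch only shows how a single row read returns the logs of all data sources at once.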
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910412910.3A CN110134702A (en) | 2019-05-17 | 2019-05-17 | Data flow joining method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910412910.3A CN110134702A (en) | 2019-05-17 | 2019-05-17 | Data flow joining method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110134702A true CN110134702A (en) | 2019-08-16 |
Family
ID=67574984
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910412910.3A Pending CN110134702A (en) | 2019-05-17 | 2019-05-17 | Data flow joining method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110134702A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110502506A (en) * | 2019-08-29 | 2019-11-26 | 北京博睿宏远数据科技股份有限公司 | A kind of data processing method, device, equipment and storage medium |
CN110515954A (en) * | 2019-08-29 | 2019-11-29 | 北京博睿宏远数据科技股份有限公司 | A kind of data processing method, device, equipment and storage medium |
CN111600944A (en) * | 2020-05-12 | 2020-08-28 | 北京锐安科技有限公司 | Data processing method, device, equipment and storage medium |
CN111831383A (en) * | 2020-07-20 | 2020-10-27 | 北京百度网讯科技有限公司 | Window splicing method, device, equipment and storage medium |
CN112434023A (en) * | 2019-08-26 | 2021-03-02 | 长鑫存储技术有限公司 | Process data analysis method and device, storage medium and computer equipment |
CN113127511A (en) * | 2020-01-15 | 2021-07-16 | 百度在线网络技术(北京)有限公司 | Data splicing method and device for multiple data streams, electronic equipment and storage medium |
CN113127512A (en) * | 2020-01-15 | 2021-07-16 | 百度在线网络技术(北京)有限公司 | Data splicing triggering method and device for multiple data streams, electronic equipment and medium |
CN113377809A (en) * | 2021-06-23 | 2021-09-10 | 北京百度网讯科技有限公司 | Data processing method and apparatus, computing device, and medium |
JP2023534347A (en) * | 2021-06-23 | 2023-08-09 | ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド | Data processing method and apparatus, computing equipment and medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102510568A (en) * | 2011-11-22 | 2012-06-20 | 联通宽带业务应用国家工程实验室有限公司 | Internet access data processing system and method for mobile terminal |
CN103685207A (en) * | 2012-09-21 | 2014-03-26 | 百度在线网络技术(北京)有限公司 | System, apparatus, and method for integrating data spanning data sources |
CN103810224A (en) * | 2012-11-15 | 2014-05-21 | 阿里巴巴集团控股有限公司 | Information persistence and query method and device |
CN103870570A (en) * | 2014-03-14 | 2014-06-18 | 广州携智信息科技有限公司 | HBase (Hadoop database) data usability and durability method based on remote log backup |
CN104391910A (en) * | 2014-11-17 | 2015-03-04 | 西安交通大学 | HBase-based tax statistic report storage and calculation method |
CN104951462A (en) * | 2014-03-27 | 2015-09-30 | 国际商业机器公司 | Method and system for managing data base |
US20180189339A1 (en) * | 2016-12-30 | 2018-07-05 | Dropbox, Inc. | Event context enrichment |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112434023A (en) * | 2019-08-26 | 2021-03-02 | 长鑫存储技术有限公司 | Process data analysis method and device, storage medium and computer equipment |
CN110515954A (en) * | 2019-08-29 | 2019-11-29 | 北京博睿宏远数据科技股份有限公司 | A kind of data processing method, device, equipment and storage medium |
CN110515954B (en) * | 2019-08-29 | 2023-01-31 | 北京博睿宏远数据科技股份有限公司 | Data processing method, device, equipment and storage medium |
CN110502506A (en) * | 2019-08-29 | 2019-11-26 | 北京博睿宏远数据科技股份有限公司 | A kind of data processing method, device, equipment and storage medium |
CN113127512A (en) * | 2020-01-15 | 2021-07-16 | 百度在线网络技术(北京)有限公司 | Data splicing triggering method and device for multiple data streams, electronic equipment and medium |
CN113127511A (en) * | 2020-01-15 | 2021-07-16 | 百度在线网络技术(北京)有限公司 | Data splicing method and device for multiple data streams, electronic equipment and storage medium |
CN113127511B (en) * | 2020-01-15 | 2023-09-15 | 百度在线网络技术(北京)有限公司 | Multi-data stream data splicing method and device, electronic equipment and storage medium |
CN113127512B (en) * | 2020-01-15 | 2023-09-29 | 百度在线网络技术(北京)有限公司 | Multi-data stream data splicing triggering method and device, electronic equipment and medium |
CN111600944A (en) * | 2020-05-12 | 2020-08-28 | 北京锐安科技有限公司 | Data processing method, device, equipment and storage medium |
CN111600944B (en) * | 2020-05-12 | 2023-02-28 | 北京锐安科技有限公司 | Data processing method, device, equipment and storage medium |
CN111831383A (en) * | 2020-07-20 | 2020-10-27 | 北京百度网讯科技有限公司 | Window splicing method, device, equipment and storage medium |
CN113377809A (en) * | 2021-06-23 | 2021-09-10 | 北京百度网讯科技有限公司 | Data processing method and apparatus, computing device, and medium |
WO2022267368A1 (en) * | 2021-06-23 | 2022-12-29 | 北京百度网讯科技有限公司 | Data processing method and apparatus, and computing device and medium |
JP2023534347A (en) * | 2021-06-23 | 2023-08-09 | ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド | Data processing method and apparatus, computing equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110134702A (en) | Data flow joining method, device, equipment and storage medium | |
US11238099B2 (en) | Method and device for obtaining answer, and computer device | |
CN108140007B (en) | Securely deploying applications across deployment locations | |
CN110008045B (en) | Method, device and equipment for aggregating microservices and storage medium | |
CN111813804B (en) | Data query method and device, electronic equipment and storage medium | |
CN109495392B (en) | Message conversion processing method and device, electronic equipment and storage medium | |
US10572597B2 (en) | Resolution of acronyms in question answering systems | |
CN114528044B (en) | Interface calling method, device, equipment and medium | |
CN109634764A (en) | Work-flow control method, apparatus, equipment, storage medium and system | |
US10216802B2 (en) | Presenting answers from concept-based representation of a topic oriented pipeline | |
CN111552895B (en) | Page route analysis method, system, equipment and medium in applet application | |
US10380257B2 (en) | Generating answers from concept-based representation of a topic oriented pipeline | |
CN109033456B (en) | Condition query method and device, electronic equipment and storage medium | |
CN109669790A (en) | Data sharing method, device, shared platform and storage medium based on cloud platform | |
US10339205B2 (en) | Efficient handling of bi-directional data | |
CN117271554A (en) | Distributed database view processing method, device, equipment and storage medium | |
CN113220237B (en) | Distributed storage method, device, equipment and storage medium | |
US20210157881A1 (en) | Object oriented self-discovered cognitive chatbot | |
CN112364268A (en) | Resource acquisition method and device, electronic equipment and storage medium | |
CN112288452A (en) | Advertisement preview method and device, electronic equipment and storage medium | |
CN113572809B (en) | Single request source multi-target source data communication method, computer equipment and storage medium | |
US20230419047A1 (en) | Dynamic meeting attendee introduction generation and presentation | |
CN111428544B (en) | Scene recognition method and device, electronic equipment and storage medium | |
CN114339125A (en) | Voice broadcasting method, device, equipment and storage medium | |
US20220253455A1 (en) | Reducing character set conversion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190816 |