GB2518171A - Improvements in or relating to data processing - Google Patents
Improvements in or relating to data processing
- Publication number
- GB2518171A (application GB1316207.8A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- data
- available
- region
- grouped
- axis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/20—Drawing from basic elements, e.g. lines or circles
- G06T11/206—Drawing of charts or graphs
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Processing Or Creating Images (AREA)
Abstract
The application concerns processing data for display on a scatter plot. For many plots the available pixel resolution will be less than the resolution of the data points to be plotted, and there will be a significant overlap of points when they are rendered for display. This overlap can be exploited by grouping data and representing multiple data as a single point of a displayed plot. This application renders the plot data by binning the data to fit the size of an available pixel region of a graphical output device. Data is grouped by breaking an available display area into cells and grouping points which lie within the same cell. The grouped point may be altered visually to show the number of points represented; this may be done by storing a size parameter along with the coordinates of the grouped point, which may act as an opacity scaling factor. The invention also allows a user to zoom in on specific areas of the plot.
Description
IMPROVEMENTS IN OR RELATING TO DATA PROCESSING
BACKGROUND
The present disclosure relates to improvements in or relating to data processing, and in particular to methods of rendering data in graphical form.
SUMMARY OF THE DISCLOSURE
According to a first aspect of the disclosure there is provided a method of displaying data comprising grouping the data to form a grouped representation of the data, and displaying said grouped representation; wherein said grouping is carried out to fit an available region of a graphical output device.
Optionally said available region is an available pixel region. Alternatively, said available region is an available print region.
Optionally, grouping the data comprises aggregating the data by data binning.
Optionally, the graphical output device is a display screen.
Optionally, the binned data is output as a scatter plot with each datum represented by a graphical symbol.
Optionally a first axis of data to be plotted is binned; and then a second axis of data to be plotted is binned for each bin of the first axis that contains two or more data.
Optionally binning the data along an axis comprises: calculating a number of bins by dividing a number of output device pixels by a characteristic pixel dimension of a graphical symbol used to represent each datum; determining lower and upper bounds for each bin based on a range of an axis to be displayed and the number of bins; and allocating each datum to a bin depending on its value for that axis.
Optionally the range of an axis to be displayed comprises a lower axis bound and an upper axis bound, as defined by the data.
Optionally each bin has a size parameter associated with it that represents the number of data that the bin comprises.
Optionally the grouped binned data are altered visually according to the size parameter, to represent the number of data that each bin comprises.
Optionally said visual alteration comprises varying an opacity value of a grouped datum.
Optionally, a grouped representation is redrawn according to new boundaries in response to a zoom command.
According to a second aspect of the disclosure there is provided a system for displaying data comprising: a database storing a plurality of data; a graphical output device comprising means to present data in a pixel region; a processor arranged to group the data to form a grouped representation of the data, and to provide commands to display said grouped representation; wherein said grouping is carried out to fit an available region of the graphical output device.
Optionally said available region is an available pixel region. Alternatively, said available region is an available print region.
Optionally, the processor comprises a data binning component.
Optionally, the graphical output device is a display screen.
Optionally, the display screen is provided as a component part of a computing device.
Optionally, the system comprises a server comprising said database and said processor, and said graphical output device is a client of the server or is provided as part of a client device of the server.
Optionally, the server is a web server and the client device runs a browser application for accessing the data.
According to a third aspect of the disclosure there is provided a computer program product that includes instructions that when run on a computer, enable it to bin data to fit an available region of a graphical output device.
Optionally said available region is an available pixel region. Alternatively, said available region is an available print region.
According to a fourth aspect of the disclosure there is provided a computer program product that includes instructions that when run on a computer, enable it to request data from a database which is binned to fit an available region of a graphical output device associated with the computer running the instructions.
Optionally said available region is an available pixel region. Alternatively, said available region is an available print region.
The computer program products of the third and fourth aspects may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fibre optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infra-red, radio, and microwave, then the coaxial cable, fibre optic cable, twisted pair, DSL, or wireless technologies such as infra-red, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media. The instructions or code associated with a computer-readable medium of the computer program product may be executed by a computer, e.g., by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, ASICs, FPGAs, or other equivalent integrated or discrete logic circuitry.
BRIEF DESCRIPTION OF THE DRAWINGS
The disclosure will be described below, by way of example only, with reference to the accompanying drawings, in which: Figure 1 illustrates the grouping together of data depending on which bin they lie in; and Figure 2 illustrates an example application of the disclosure for illustrative purposes.
DETAILED DESCRIPTION
Data is only useful if it can be processed and interpreted, and for this reason graphical representations of data are key tools for helping humans review and understand patterns and trends represented by the data. One popular graphical representation is a scatter plot, where variables are plotted along axes of the plot and each datum is represented as a point on the plot. The representation of each datum may take the form of a graphical symbol, such as a circle, square, triangle or generally any desired shape. Different symbols can be used to represent different data sets.
Graphical representations of data are presented by use of various display technologies, including CRT, LCD or LED displays to name but some of many technologies that are available.
The present disclosure is not limited to any type of display technology. A display will have a set of display pixels for displaying information to a user, which defines the resolution of the display. For example, a popular resolution for a high definition desktop LCD monitor used with a personal computer may be 1366 x 768 pixels in a 16:9 aspect ratio. There is a wide range of display resolutions across desktop monitors and screens for mobile computing devices such as cellular telephones, tablet computing devices, laptops and so on.
As an example, we can imagine a scatter plot with 10,000 points that is 700 pixels wide by 500 pixels tall, and where each point is represented by a circular graphical symbol having a radius of three pixels. It is to be noted that real examples may have many times more than 10,000 data points.
Rendering all 10,000 points on one graph would represent a high resource cost for a computer that is performing the rendering, which could result in either a slow or unresponsive visualization, or even worse, a crash.
Furthermore, users often like to compare multiple graphs side by side, say up to nine. If multiple graphs of similar size need to be plotted, this would degrade the performance of a computer rendering the graphs still further.
For many plots, the available pixel resolution will be less than the resolution of the data points to be plotted, so there will be a significant overlap of points when they are rendered for display. In the example mentioned above (10,000 data points to be plotted on a pixel area of 700 by 500 pixels), there will most likely be significant overlap, as the points in a realistic plot will most likely never be uniformly distributed.
This overlap can be exploited by grouping data and representing multiple data as a single point on a displayed plot. Grouping points based on the distance between each other works well, but requires each point to be checked against every other point, which is computationally expensive, so the overall process of obtaining the data, filtering it and then rendering it would take a relatively long time.
The disclosure provides for rendering data by binning the data to fit the size of an available pixel region of a graphical output device. Data are grouped by breaking an available display area into cells and grouping together points which lie within the same cell.
This is illustrated in figure 1, which shows a selected sub-portion of a display screen. In this example a data point is graphically represented by a circle which has a radius of three pixels on a display screen when rendered. Therefore, the display is broken into cells of three pixels square and points lying within each cell are grouped together and plotted as a single point.
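The cell-based grouping described above can be sketched as follows. This is a minimal illustration only, not the applicant's implementation: the function name `group_by_cell` and the parameters `x_min`, `y_min`, `px_per_unit_x` and `px_per_unit_y` (the data coordinates of the plot origin and the number of pixels per data unit on each axis) are assumptions introduced for the example. Each point is converted to a pixel position and then to a cell index, and points sharing an index are collected together.

```python
from collections import defaultdict


def group_by_cell(points, x_min, y_min, px_per_unit_x, px_per_unit_y, cell_px=3):
    """Collect (x, y) points that fall in the same cell_px-by-cell_px pixel cell.

    All names here are hypothetical; cell_px=3 follows the three-pixel
    cells of the example in the text.
    """
    cells = defaultdict(list)
    for x, y in points:
        # Data coordinate -> pixel offset from the plot origin -> cell index.
        col = int((x - x_min) * px_per_unit_x // cell_px)
        row = int((y - y_min) * px_per_unit_y // cell_px)
        cells[(col, row)].append((x, y))
    return cells


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.01, 0.01), (5.0, 5.0)]
    for cell, members in group_by_cell(pts, 0.0, 0.0, 35.0, 35.0).items():
        print(cell, len(members))
```

Each point is visited once, so the cost grows linearly with the number of points rather than with the number of point pairs.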
When multiple points are grouped together, the resulting grouped point assumes the X and Y coordinates of the first point for that group or, alternatively, the center of the cell for that group.
This computationally cheap method is sufficient given the level of overlap that would typically be present. More complex schemes could be employed, such as calculating an average of all the group coordinates, but it is preferred to avoid this complexity: for the vast majority of cases it would represent an unnecessary waste of CPU time, as the visual difference would be so minor.
In some embodiments, a grouped point may be altered visually to show the number of points it represents, thus creating the illusion that there is more than one point being rendered. This may be achieved by storing a size parameter along with the coordinates of the grouped point, where the size parameter represents the number of raw data that are combined to form a grouped datum. The size parameter can then be used to define a style to be applied to the displayed plot points, for example acting as an opacity scaling factor. As an illustration, if a single non-grouped point is rendered with an opacity of 0.25, a grouped point representing 3 points may be rendered with an opacity of 0.75. As the grouped point is rendered at 3 times the darkness of a regular point, from a distance it would look almost indistinguishable from 3 overlapping points.
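The opacity scaling described above might be expressed as in the sketch below. The function name `grouped_opacity` is hypothetical, and clamping the result at 1.0 is an assumption added for the example, since the text does not say how groups larger than four points would be styled at a base opacity of 0.25.

```python
def grouped_opacity(size, base_opacity=0.25):
    """Opacity for a grouped point that stands for `size` raw points.

    The size parameter acts as a linear scaling factor on the opacity of a
    single point. The cap at 1.0 (fully opaque) is an assumption.
    """
    return min(1.0, base_opacity * size)


if __name__ == "__main__":
    for n in (1, 3, 10):
        print(n, grouped_opacity(n))
```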
The groups may be chosen by binning the data one dimension at a time. For example, the number of bins can be calculated by dividing the number of available pixels (px) on this axis by the characteristic dimension of the graphical symbol to be displayed (in the example mentioned above and as illustrated in figure 1, a circle of three-pixel radius, that is, 700 px / 3 px = ~234 bins). A bin size can then be calculated by dividing the range of the axis by the number of bins (e.g. if the axis is -10 to 10 then the range would be 20; thus the bin size would be 20 / 234 = 0.0855). The lower and upper bounds of each bin can then be calculated based on the size and number of bins (e.g. the first bin would span from -10 to -9.9145, the second from -9.9145 to -9.829, and so on). Each point is then put into a bin depending on its value for that axis.
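The single-axis binning step can be sketched as below, using the figures from the example (a 700-pixel axis, a three-pixel symbol dimension and an axis range of -10 to 10). The function name `bin_axis` and its parameters are hypothetical. Rounding the bin count up and clamping out-of-range values into the first or last bin are assumptions, as the text gives only the approximate count of 234.

```python
import math


def bin_axis(values, axis_min, axis_max, axis_px, symbol_px):
    """Bin values along one axis; returns (n_bins, bin_size, bins).

    `bins` maps a bin index to the list of positions in `values` that fall
    in that bin. Names and rounding behaviour are assumptions.
    """
    n_bins = math.ceil(axis_px / symbol_px)      # 700 / 3 -> 234
    bin_size = (axis_max - axis_min) / n_bins    # 20 / 234 -> about 0.0855
    bins = {}
    for position, value in enumerate(values):
        index = int((value - axis_min) / bin_size)
        # Keep the upper axis bound (and any stray value) inside a valid bin.
        index = min(max(index, 0), n_bins - 1)
        bins.setdefault(index, []).append(position)
    return n_bins, bin_size, bins


if __name__ == "__main__":
    n_bins, bin_size, bins = bin_axis([-10, -9.95, 10], -10, 10, 700, 3)
    print(n_bins, round(bin_size, 4), bins)
```

Bin `i` spans from `axis_min + i * bin_size` to `axis_min + (i + 1) * bin_size`, which reproduces the bounds quoted in the text for the first two bins.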
Using this function we can then create the grouped points. Data is binned along a first axis as described above. Then, for each bin returned that holds two or more points, data is binned along a second axis, using the range and dimensions of that axis. Data are now grouped and should be packaged in a format that is understandable by the application running on the client. If the grouped points are to be altered visually (opacity, size etc.) then they should have their size parameter attached to them as well.
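A minimal sketch of the two-pass grouping is given below, assuming the first point of each group supplies the group's coordinates and that grouped points are packaged as dictionaries with `x`, `y` and `size` keys. The function names, the dictionary layout and the rounding of the bin count are all assumptions made for illustration; the disclosure does not prescribe a packaging format.

```python
import math


def bin_index(value, lo, hi, n_bins):
    """Index of the bin containing `value` on an axis spanning lo..hi."""
    index = int((value - lo) / (hi - lo) * n_bins)
    return min(max(index, 0), n_bins - 1)


def group_points(points, x_range, y_range, width_px, height_px, symbol_px):
    """Bin along x, then bin along y only inside x bins holding 2+ points."""
    nx = math.ceil(width_px / symbol_px)
    ny = math.ceil(height_px / symbol_px)

    # First pass: bin every point along the first (x) axis.
    x_bins = {}
    for point in points:
        x_bins.setdefault(bin_index(point[0], *x_range, nx), []).append(point)

    grouped = []
    for members in x_bins.values():
        if len(members) == 1:
            # A lone point needs no second pass.
            x, y = members[0]
            grouped.append({"x": x, "y": y, "size": 1})
            continue
        # Second pass: bin this x bin's members along the second (y) axis.
        y_bins = {}
        for point in members:
            y_bins.setdefault(bin_index(point[1], *y_range, ny), []).append(point)
        for cell in y_bins.values():
            x, y = cell[0]  # the group takes the first point's coordinates
            grouped.append({"x": x, "y": y, "size": len(cell)})
    return grouped


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.01, 0.01), (0.02, 9.0), (5.0, 5.0)]
    for g in group_points(pts, (-10, 10), (-10, 10), 700, 500, 3):
        print(g)
```

The `size` value carried by each grouped point is what a client could later use for visual alteration such as opacity scaling.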
It is often desirable when dealing with data visualizations to allow the user to zoom in to further explore the data that are represented. The grouping algorithm described above relies on the relationship between the size of each point, the dimensions of the graph in pixels and the range of the axes. Grouping of data points also results in a loss of detail when the scale of the display axes is increased. Therefore when a user wishes to zoom in on a graph, the grouping algorithm is run again, as both the range of the axes and the pixel spacing between each point will change.
A user may interact through a suitable interface to select an area of a graph that they wish to zoom in on. This may for example be by clicking and dragging to select a rectangular area.
When an area has been selected, the lower and upper bounds of the selection box are calculated. New axis ranges are passed back to a database where the raw data are stored, requesting a new subset of the data to be displayed. When retrieving the data needed from the database, only the data which lies within the viewable area is obtained. This reduced dataset is then grouped as described above using the new ranges and rendered on the display.
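The zoom step can be sketched as follows. Both function names are hypothetical, and several assumptions are made for the example: the selection box is given in pixels relative to the plot area as (left, top, right, bottom), screen y increases downward, and an in-memory list stands in for the database query that would return only the data within the viewable area.

```python
def selection_to_ranges(box_px, plot_px, x_range, y_range):
    """Convert a pixel selection box into new (x_range, y_range) axis bounds."""
    left, top, right, bottom = box_px
    width, height = plot_px
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    new_x = (x_lo + (x_hi - x_lo) * left / width,
             x_lo + (x_hi - x_lo) * right / width)
    # Screen y grows downward (assumption), so the box's bottom edge gives
    # the lower axis bound and its top edge gives the upper axis bound.
    new_y = (y_hi - (y_hi - y_lo) * bottom / height,
             y_hi - (y_hi - y_lo) * top / height)
    return new_x, new_y


def points_in_view(raw_points, x_range, y_range):
    """Stand-in for the database request: keep only points inside the view."""
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    return [(x, y) for x, y in raw_points
            if x_lo <= x <= x_hi and y_lo <= y <= y_hi]


if __name__ == "__main__":
    raw = [(-5, -5), (1, 1), (9, 9.5), (11, 1)]
    # Select the top-right quarter of a 700 x 500 plot spanning -10..10.
    new_x, new_y = selection_to_ranges((350, 0, 700, 250), (700, 500),
                                       (-10, 10), (-10, 10))
    print(new_x, new_y, points_in_view(raw, new_x, new_y))
```

The reduced set returned for the new ranges would then be passed through the same grouping procedure before rendering.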
Labels for the axes are also updated to match the selected data.
The grouping algorithm described herein may be implemented in various different ways. In one embodiment, a single computer comprises a database with raw data and an application for rendering a graph on a display screen, and the grouping algorithm operates as part of the application for rendering data from the database. In another embodiment, a database with the raw data may be provided on a server and accessed by a user with a client computing device such as a personal computer, laptop or portable computing device such as a tablet computer or a cell phone. The grouping algorithm may be performed at the server side so that the load on the front-end user application which renders the graph is minimised. The server-side grouping algorithm and client-side rendering engine may suitably be provided as a web application, where the grouped data is served as HTML documents over TCP/IP or HTTP for viewing by an appropriate browser. When the raw data is hosted on a server, it may be on a single server or may be distributed over several devices in a grid or cloud-based manner. When implemented as a web application, new axis ranges needed when zooming can be passed from the front-end to the server by an AJAX call or other suitable technique.
Figure 2 illustrates an example application of the disclosure for illustrative purposes. The left hand side shows an example appearance of a plot with ten thousand data points, while the right hand side shows an example appearance of a plot where the methods of the disclosure have been applied and a scatter plot is rendered based on a grouped data set with just two thousand data points. It can be seen that there is no appreciable difference between the two plots, although the one on the right hand side can be rendered much more quickly because the underlying set of data is greatly reduced as compared with the plot of the left hand side.
With the disclosure, high-density multidimensional data can be filtered cheaply on the backend of a web application before it is to be rendered onto a scatterplot by the client. This allows the load on the front-end of the application to be significantly reduced without visibly sacrificing information. This also reduces the data transmitted over an internet connection, making it easier, faster and more reliable to serve high density data to mobile devices or over slower internet connections. The method disclosed is simple and efficient and can be processed quickly enough to enable zooming and similar functions at minimal computational expense.
Various improvements and modifications can be made to the above without departing from the scope of the disclosure. It will be appreciated that while the embodiments described above have referred primarily to optimising graphical data for display on a display device, the techniques can also be used for a graphical output device that comprises a printer; that is, data can be binned according to an available pixel area that is governed by a printer's print resolution, reducing ink usage.
Claims (26)
- 1. A method of displaying data comprising grouping the data to form a grouped representation of the data, and displaying said grouped representation; wherein said grouping is carried out to fit an available region of a graphical output device.
- 2. The method of claim 1, wherein said available region is an available pixel region. Alternatively, said available region is an available print region.
- 3. The method of claim 1 or claim 2, wherein grouping the data comprises aggregating the data by data binning.
- 4. The method of any preceding claim, wherein the graphical output device is a display screen.
- 5. The method of any preceding claim, wherein the binned data is output as a scatter plot with each datum represented by a graphical symbol.
- 6. The method of any preceding claim, wherein a first axis of data to be plotted is binned; and then a second axis of data to be plotted is binned for each bin of the first axis that contains two or more data.
- 7. The method of claim 6, wherein binning the data along an axis comprises: calculating a number of bins by dividing a number of output device pixels by a characteristic pixel dimension of a graphical symbol used to represent each datum; determining lower and upper bounds for each bin based on a range of an axis to be displayed and the number of bins; and allocating each datum to a bin depending on its value for that axis.
- 8. The method of any preceding claim, wherein the range of an axis to be displayed comprises a lower axis bound and an upper axis bound, as defined by the data.
- 9. The method of claim 8, wherein each bin has a size parameter associated with it that represents the number of data that the bin comprises.
- 10. The method of any preceding claim, wherein the grouped binned data are altered visually according to the size parameter, to represent the number of data that each bin comprises.
- 11. The method of claim 10, wherein said visual alteration comprises varying an opacity value of a grouped datum.
- 12. The method of any preceding claim, wherein a grouped representation is redrawn according to new boundaries in response to a zoom command.
- 13. A system for displaying data comprising: a database storing a plurality of data; a graphical output device comprising means to present data in a pixel region; a processor arranged to group the data to form a grouped representation of the data, and to provide commands to display said grouped representation; wherein said grouping is carried out to fit an available region of the graphical output device.
- 14. The system of claim 13, wherein said available region is an available pixel region.
- 15. The system of claim 13, wherein said available region is an available print region.
- 16. The system of any of claims 13 to 15, wherein the processor comprises a data binning component.
- 17. The system of any of claims 13 to 16, wherein the graphical output device is a display screen.
- 18. The system of claim 17, wherein the display screen is provided as a component part of a computing device.
- 19. The system of any of claims 13 to 18, wherein the system comprises a server comprising said database and said processor, and said graphical output device is a client of the server or is provided as part of a client device of the server.
- 20. The system of claim 19, wherein the server is a web server and the client device runs a browser application for accessing the data.
- 21. A computer program product that includes instructions that when run on a computer, enable it to bin data to fit an available region of a graphical output device.
- 22. The product of claim 21, wherein said available region is an available pixel region.
- 23. The product of claim 21, wherein said available region is an available print region.
- 24. A computer program product that includes instructions that when run on a computer, enable it to request data from a database which is binned to fit an available region of a graphical output device associated with the computer running the instructions.
- 25. The product of claim 24, wherein said available region is an available pixel region.
- 26. The product of claim 24, wherein said available region is an available print region.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1316207.8A GB2518171A (en) | 2013-09-11 | 2013-09-11 | Improvements in or relating to data processing |
US14/482,910 US10402727B2 (en) | 2013-09-11 | 2014-09-10 | Methods for evaluating and simulating data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1316207.8A GB2518171A (en) | 2013-09-11 | 2013-09-11 | Improvements in or relating to data processing |
Publications (2)
Publication Number | Publication Date |
---|---|
GB201316207D0 (en) | 2013-10-23 |
GB2518171A (en) | 2015-03-18 |
Family
ID=49487081
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB1316207.8A Withdrawn GB2518171A (en) | 2013-09-11 | 2013-09-11 | Improvements in or relating to data processing |
Country Status (1)
Country | Link |
---|---|
GB (1) | GB2518171A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3460646A4 (en) * | 2016-05-19 | 2019-04-24 | Sony Corporation | Information processing device, program, and information processing system |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10463445B2 (en) * | 2017-11-27 | 2019-11-05 | Biosense Webster (Israel) Ltd. | Point density illustration |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006003484A2 (en) * | 2004-07-01 | 2006-01-12 | Spotfire Ab | Binning system for data analysis |
EP1717755A2 (en) * | 2005-03-08 | 2006-11-02 | Oculus Info Inc. | System and method for large scale information analysis using data visualization techniques |
US20110050702A1 (en) * | 2009-08-31 | 2011-03-03 | Microsoft Corporation | Contribution based chart scaling |
EP2485190A1 (en) * | 2011-02-04 | 2012-08-08 | Thomson Licensing | Adapting the resolution of a graphic representation of metadata |
Also Published As
Publication number | Publication date |
---|---|
GB201316207D0 (en) | 2013-10-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7940271B2 (en) | System and method for large scale information analysis using data visualization techniques | |
Agrawal et al. | Challenges and opportunities with big data visualization | |
US20210232634A1 (en) | Quantified euler analysis | |
US8863034B2 (en) | 3D tag clouds for visualizing federated cross-system tags | |
US20160062585A1 (en) | Managing objects in panorama display to navigate spreadsheet | |
US20120102419A1 (en) | Representing data through a graphical object | |
TW201525838A (en) | Layer based reorganization of document components | |
US8432400B1 (en) | Transitional animation for generating tree maps | |
US20170221237A1 (en) | Data visualization system for exploring relational information | |
AU2015296903A1 (en) | Interface for accessing target data and displaying output to a user | |
US20230033541A1 (en) | Generating a visualization of data points returned in response to a query based on attributes of a display device and display screen to render the visualization | |
WO2015106214A2 (en) | Visually approximating parallel coordinates data | |
GB2518171A (en) | Improvements in or relating to data processing | |
US11687552B2 (en) | Multi-faceted visualization | |
US10289283B1 (en) | Visual analysis for multi-dimensional data | |
US20120167015A1 (en) | Providing visualization of system landscapes | |
US20120159376A1 (en) | Editing data records associated with static images | |
US9171387B2 (en) | Data visualization system | |
US8488183B2 (en) | Moving labels in graphical output to avoid overprinting | |
Graves | Techniques to reduce cluttering of rdf visualizations | |
Liao et al. | Application study of information visualization in digital library | |
US20150082235A1 (en) | Difference-oriented user interface creation | |
Wang et al. | The application of data cubes in business data visualization | |
CN117557682B (en) | Data processing method, apparatus, product, device, and medium | |
US10102652B2 (en) | Binning to prevent overplotting for data visualization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WAP | Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1) |