US20180121680A1 - Obfuscating web code - Google Patents
Obfuscating web code
- Publication number
- US20180121680A1 (US 15/859,694)
- Authority
- US
- United States
- Prior art keywords
- expressions
- code
- computer
- data
- replacement
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/12—Protecting executable software
- G06F21/121—Restricting unauthorised execution of programs
- G06F21/125—Restricting unauthorised execution of programs by manipulating the program code, e.g. source code, compiled code, interpreted code, machine code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/64—Protecting data integrity, e.g. using checksums, certificates or signatures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/52—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity ; Preventing unwanted data erasure; Buffer overflow
- G06F21/54—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity ; Preventing unwanted data erasure; Buffer overflow by adding security routines or objects to programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
Definitions
- This document relates to computer security and interference with malware.
- Bot activities include content scraping, reconnaissance, credential stuffing, creating fake accounts, comment spamming, and similar activities. Bots can impose an unnecessary load on any company trying to serve web content efficiently. More importantly, they can attempt to “learn” the operation of a web site so as to exploit it.
- malicious software may execute a “man in the browser” attack by intercepting communications that a user makes with a web site in a manner that makes the user believe that he or she is actually communicating with the web site. For example, malware may generate a display for a user who is visiting a banking site, where the display requests from the user information such as social security number, credit card number, other account numbers. An organization that operates the malware may then have such data sent to it, and may use the data to steal from the user, the web site operator, or both.
- This document describes systems and techniques by which web code (e.g., HTML, CSS, and JavaScript) that a server system provides to client devices is modified before it is served over the internet, so as to make more difficult the exploitation of the code and the operator of the server system by clients that receive the code (including clients that are infected without their human users' knowledge).
- the modifications can be made to encode sensitive data, and may differ for different instances in which a web page and related content are served, whether to the same client computer or to different client computers.
- a single expression or value in the code may be re-written as multiple expressions that, when executed, produce the initial value or expression. Where different code is served in response to each request, the expressions into which the initial value are resolved may also differ each time.
- The output of the code, when executed on the client computer, however, is the same for all such different versions of the served code, so that a user at a client computer does not perceive a difference in the displayed web page, whether the code is served to two different users or to a single user in two different web browsing sessions.
- The manner in which an initial value or expression is rewritten into multiple expressions capable of being executed on a client computer may take a variety of forms. For example, different expressions, different numbers of expressions, and different ordering of the execution of the expressions may all be varied to interfere with malware. Also, these different parameters may be varied so as to be different from one serving of the code to the next. Such variation, which may be termed “polymorphism” of the code, may help create a moving target against which malware needs to apply itself.
- Changing the code that is served to client devices in an essentially random manner (i.e., a manner that effectively prevents malware that has analyzed serving n from inferring something useful about serving n+x) each time the code is served can deter malicious code executing on the client computers (e.g., a Man-in-the-Browser bot) from interacting with the served code in a predictable way so as to trick a user of the client computer into providing confidential financial information and the like.
- external programs generally cannot drive web application functionality directly, and so preventing predictable interaction with served code can be an effective mechanism for preventing malicious computer activity.
- The techniques transform values or expressions, such as a cleartext string, a JavaScript object, or a JavaScript code snippet, into another JavaScript snippet that is equivalent to the input after it is executed (i.e., it produces an identical displayed output).
- the encoding is dynamic and random, which means that the encoding generates different output code each time given the same input (though the outputs may repeat periodically as long as that repetition is not frequent enough to allow malware to predict the output or readily obtain the repeated output). Because the encoded output code is still presented as cleartext, it may not be able to prevent a human from ascertaining sensitive data, but it may make it very difficult for a malicious party to write a computer program to extract the sensitive data automatically.
- Some of these attacks include: (a) denial of service attacks, and particularly advanced application denial of service attacks, in which a malicious party targets a particular functionality of a website (e.g., a widget or other web application) and floods the server with requests for that functionality until the server can no longer respond to requests from legitimate users; (b) rating manipulation schemes in which fraudulent parties use automated scripts to generate a large number of positive or negative reviews of some entity such as a marketed product or business in order to artificially skew the average rating for the entity up or down; (c) fake account creation in which malicious parties use automated scripts to establish and use fake accounts on one or more web services to engage in attacks ranging from content spam, e-mail spam, identity theft, phishing, ratings manipulation, fraudulent reviews, and countless others; (d) fraudulent reservation of rival goods, by which a malicious party exploits flaws in a merchant's website to engage in a form of online scalp
- the systems, methods, and techniques for web code modifications described in this paper can, in certain implementations, prevent or deter one or more of these types of attacks. For example, transforming sensitive data by replacing expressions with a set of equivalent expressions and then interleaving the expressions in the set of equivalent expressions can cause the effectiveness of bots and other malicious automated scripts to be substantially diminished.
- the modification of code may be carried out by a security system that may supplement a web server system, and may intercept requests from client computers to the web server system and intercept responses from web servers of the system when they serve content back to the client computers (including where pieces of the content are served by different server systems).
- the modification may be of static code (e.g., HTML) and of related executable code (e.g., JavaScript) in combination.
- An expression may be rewritten as an equivalent expression or multiple expressions.
- the combination of the three expressions in the set of expressions produces the same result as the original expression—that is, an assignment of the value 2 to the variable y.
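- As a minimal sketch of such a rewrite, an assignment of the value 2 to `y` might be served as three statements. The helper names `a` and `b` and their values are illustrative assumptions, not taken from the figures.

```javascript
// Original statement as written by the site author:
//   var y = 2;
// One hypothetical three-statement replacement (a and b are assumed names).
var a = 7;      // arbitrary helper value chosen by the encoder
var b = 5;      // chosen so that a - b equals the original constant
var y = a - b;  // y ends up holding 2, as in the original statement
```

- Any pair of values whose difference is 2 would serve equally well, which is what lets the served statements differ from one request to the next.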
- Such rewriting, or transforming, of code may occur by first identifying data present in code that is to be served to the client computer (e.g., HTML, CSS, and JavaScript) and grouping such occurrences of sensitive data for further processing (e.g., by generating flags that point to each such element or copying a portion of each such element).
- the identified data may be identified as sensitive or potentially sensitive or simply data that should be rewritten before being served. Processing of the data may occur by modifying each element throughout different formats of code, such as changing an expression in the manner above each time that name occurs in a parameter, method call, DOM operation, or elsewhere. Next, further processing may occur that comprises interleaving the set of elements throughout the new code. Such a process may be repeated each time a client computer requests code, and the modifications may be different for each serving of the same code.
- The analysis to identify values or expressions that can be rewritten without affecting the operation of the code may be performed once, and a map to occurrences of such values or expressions in the code may be generated and then used for each serving of the code to locate the occurrences, so that they may be altered throughout the code in a consistent manner that does not break the code.
- Such analyze-once, transform-many approaches may lessen the computational load for such a system and allow greater scaling of the system to larger web server systems with high volume requirements.
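- The analyze-once, transform-many idea can be sketched roughly as follows. The function names `analyzeOnce` and `transformForServing`, the regex-based scan, and the sum-based rewrite are all illustrative assumptions; a production system would more likely rely on a real parser.

```javascript
// Hypothetical sketch of "analyze once, transform many".

// Done once per page: record where rewritable integer literals occur.
function analyzeOnce(code) {
  const occurrences = [];
  const literal = /\b\d+\b/g;
  let match;
  while ((match = literal.exec(code)) !== null) {
    occurrences.push({
      start: match.index,
      end: match.index + match[0].length,
      value: Number(match[0]),
    });
  }
  return occurrences;
}

// Done per request: reuse the map and emit a fresh random rewrite.
function transformForServing(code, occurrences, rng = Math.random) {
  let out = "";
  let cursor = 0;
  for (const occ of occurrences) {
    const k = Math.floor(rng() * 10); // random split of the literal
    out += code.slice(cursor, occ.start) + `(${occ.value - k} + ${k})`;
    cursor = occ.end;
  }
  return out + code.slice(cursor);
}

const page = "var total = 24; var unit = 1;";
const occurrenceMap = analyzeOnce(page); // computed once
const servingA = transformForServing(page, occurrenceMap);
const servingB = transformForServing(page, occurrenceMap);
```

- Only `transformForServing` runs on the request path, so the cost of locating the literals is paid once rather than per serving.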
- Such modification of the served code can help to prevent bots or other malicious code from exploiting or even detecting weaknesses in the web server system.
- the names of functions or variables may be changed in various random manners each time a server system serves the code.
- such constantly changing modifications may interfere with the ability of malicious parties to identify how the server system operates and web pages are structured, so that the malicious party cannot generate code to automatically exploit that structure in dishonest manners.
- Such techniques may create a moving target that can prevent malicious organizations from reverse-engineering the operation of a web site so as to build automated bots that can interact with the web site, and potentially carry out Man-in-the-Browser and other Man-in-the-Middle operations and attacks.
- the techniques discussed here may be carried out by a server subsystem that acts as an adjunct to a web server system that is commonly employed by a provider of web content.
- an internet retailer may have an existing system by which it presents a web storefront at a web site (e.g., www.examplestore.com), interacts with customers to show them information about items available for purchase through the storefront, and processes order and payment information through that same storefront.
- the techniques discussed here may be carried out by the retailer adding a separate server subsystem (either physical or virtualized) that stands between the prior system and the internet.
- the new subsystem may act to receive web code from the web servers (or from a traffic management system that receives the code from the web servers), may translate that code in random manners before serving it to clients, may receive responses from clients and translate them in the opposite direction, and then provide that information to the web servers using the original names and other data.
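- A rough sketch of this two-way translation, applied to form field names, might look like the following. The helpers `makeNameMap`, `encodeFields`, and `decodeSubmission`, the `f_` alias prefix, and the plain string replacement are assumptions made for illustration only.

```javascript
// Hypothetical sketch of per-serving name translation and its reversal.
function makeNameMap(names, rng = Math.random) {
  const forward = new Map();
  const reverse = new Map();
  for (const name of names) {
    let alias;
    do {
      alias = "f_" + Math.floor(rng() * 0xffffffff).toString(36);
    } while (reverse.has(alias)); // avoid collisions within one serving
    forward.set(name, alias);
    reverse.set(alias, name);
  }
  return { forward, reverse };
}

// Outbound: replace original field names in the served HTML.
function encodeFields(html, forward) {
  let out = html;
  for (const [name, alias] of forward) {
    out = out.split(`name="${name}"`).join(`name="${alias}"`);
  }
  return out;
}

// Inbound: restore original names before handing data to the web server.
function decodeSubmission(params, reverse) {
  const restored = {};
  for (const [key, value] of Object.entries(params)) {
    restored[reverse.has(key) ? reverse.get(key) : key] = value;
  }
  return restored;
}
```

- The reverse map has to be kept (or be derivable) per serving, since each response uses different aliases.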
- a system may provide the retailer or a third party with whom the retailer contracts (e.g., a web security company that monitors data from many different clients and helps them identify suspect or malicious activity) with information that identifies suspicious transactions.
- the security subsystem may keep a log of abnormal interactions, may refer particular interactions to a human administrator for later analysis or for real-time intervention, may cause a financial system to act as if a transaction occurred (so as to fool code operating on a client computer) but to stop such a transaction, or any number of other techniques that may be used to deal with attempted fraudulent transactions.
- A computer-implemented method includes identifying a piece of data for serving from a server system to a client device that is remote from the server system, the piece of data being part of executable code requested from the server by the client device; creating a plurality of expressions that, when executed, provide a result that corresponds to the piece of data; and providing, to the client device and as part of the executable code, the plurality of expressions along with code for executing the plurality of expressions, so that when the plurality of expressions are executed on the client device, the identified piece of data is returned on the client device without a need to serve the identified piece of data to the client device.
- The method can include performing a permutation on the plurality of expressions so that the plurality of expressions are ordered in the executable code in an order different than they were created. The order of the expressions can be selected randomly as part of the permutation.
- the method can include creating one or more additional expressions whose executed results are not used by other code that is part of the executable code served to the client device; and providing to the client device the plurality of expressions with the one or more additional expressions. Also, the method can include identifying, in the piece of data, data that needs to be kept away from malware that may be in the client device, and wherein creating a plurality of expressions comprises creating one or more replacement statements that when executed, provide a result that corresponds to the potentially sensitive data. The replacement statements can comprise one or more expressions that do not execute on the client device when the executable code is executed.
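- One way to picture the additional, unused expressions is the sketch below, which pads a set of replacement statements with decoys. The function `addDecoys` and the `junk` variable names are hypothetical.

```javascript
// Hypothetical sketch: pad replacement statements with decoy statements
// whose results are never read by the rest of the program.
function addDecoys(statements, count, rng = Math.random) {
  const out = statements.slice();
  for (let i = 0; i < count; i++) {
    const decoy =
      `var junk${i} = ${Math.floor(rng() * 100)} * ${Math.floor(rng() * 100)};`;
    // Decoys depend on nothing, so any insertion point keeps the code valid
    // and leaves the relative order of the real statements untouched.
    const pos = Math.floor(rng() * (out.length + 1));
    out.splice(pos, 0, decoy);
  }
  return out;
}

const real = ["var a = 7;", "var b = 5;", "var y = a - b;"];
const padded = addDecoys(real, 3);
```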
- The method can further include identifying, in the piece of data, a first expression and a second expression to be replaced, wherein creating a plurality of expressions comprises creating a first set of replacement expressions corresponding to the first expression and a second set of replacement expressions corresponding to the second expression; and interleaving the replacement expressions of the first set of replacement expressions with the replacement expressions of the second set of replacement expressions, wherein the plurality of expressions provided to the client device comprise the interleaved replacement expressions.
- creating a plurality of expressions comprises creating a first set of replacement expressions; identifying a first replacement expression in the first set of replacement expressions; creating a second set of replacement expressions that, when executed, provide a result that corresponds to the first replacement expression; and replacing the first replacement expression with the second set of replacement expressions.
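- The repeated replacement described above can be approximated with a recursive encoder. The sketch below uses nested sub-expressions rather than separate statements, and `encodeNested`, its `depth` parameter, and the subtraction-based split are assumptions for illustration.

```javascript
// Hypothetical sketch: replace a value with p - q, then replace p and q
// themselves in the same way, down to a chosen depth.
function encodeNested(value, depth, rng = Math.random) {
  if (depth === 0) return String(value);
  const q = Math.floor(rng() * 50) + 1; // random positive offset
  const p = value + q;                  // chosen so that p - q === value
  return `(${encodeNested(p, depth - 1, rng)} - ${encodeNested(q, depth - 1, rng)})`;
}

const nested = encodeNested(24, 3); // an expression with eight leaf constants
```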
- the piece of data to be served comprises formats of code in HTML, CSS, and JavaScript, and wherein each of the formats interoperates with the other formats.
- a computer-implemented method comprises receiving, from a server system, web content comprising original code, wherein the web content is requested by a client device that is remote from the server system; identifying a piece of data in the code; creating a plurality of expressions that, when executed, provide a result that corresponds to the piece of data; generating modified code comprising the original code with the piece of data replaced with the plurality of expressions; and providing the modified code to the client device, wherein, when executed, the modified code provides a result that corresponds to the original code.
- generating modified code comprises interleaving the plurality of expressions into the original code with the identified piece of data removed.
- the plurality of expressions is created in a first ordering, and the plurality of expressions is interleaved into the original code so that the plurality of expressions maintains the first ordering.
- the plurality of expressions are created in a first ordering, and the plurality of expressions are interleaved into the original code so that the plurality of expressions are in a second ordering that is different than the first ordering.
- the plurality of expressions includes one or more junk expressions that do not execute.
- the method further comprises selecting a first expression among the plurality of expressions; and creating a second plurality of expressions that, when executed, provide a result that corresponds to the selected first expression, wherein the generated modified code comprises the original code with the piece of data replaced with the plurality of expressions, with the selected first expression replaced with the second plurality of expressions.
- a computer system for recoding web content served to client computers comprises an interface for receiving information from a web server system configured to provide computer code in multiple different formats in response to requests from client computing devices; and a security intermediary that is arranged to (i) receive the computer code from the interface before the computer code is provided to the client computing devices, (ii) identify a piece of data in the computer code that is to be replaced; (iii) create a plurality of expressions that, when executed, provide a result that corresponds to the piece of data; and (iv) provide the plurality of expressions to the client computing devices with code for executing the plurality of expressions.
- the piece of data in the computer code that is to be replaced is identified as potentially sensitive data.
- the security intermediary is further arranged to perform a permutation of the plurality of expressions.
- the plurality of expressions comprise one or more expressions that do not execute.
- The security intermediary is further arranged to interleave the plurality of expressions with the code for executing the plurality of expressions.
- FIG. 1 is a conceptual diagram showing transformation of a value into multiple expressions that execute back to the value.
- FIG. 1A depicts a general overview of a system for requesting, modifying, and serving web content.
- FIG. 1B depicts a schematic diagram of an encoding system that modifies requested web content.
- FIG. 2 depicts an overview of a method for modifying program code.
- FIGS. 3A-3G depict various examples for modifying code for web content.
- FIG. 4 is a flow diagram of a process for serving modified, or encoded, web content.
- FIG. 5 shows a system for serving polymorphic code.
- FIG. 6 is a schematic diagram of a general computing system.
- FIG. 1 is a conceptual diagram showing transformation of a value into multiple expressions that execute back to the value.
- the diagram attempts to show at a high level how initial representations in code can be rewritten as multiple representations that together can be executed on a client device to return the initial representation.
- The multiple representations can be difficult for automated malware to analyze because they cannot easily be matched to a template, can be scattered throughout the code in appropriate circumstances, and can be constantly changed, both in their values and in their order and locations in the code.
- the diagram depicts a process, flowing from left-to-right.
- the process starts with a value 102 , which may take a variety of forms.
- the value may be a simple string or number in plaintext form.
- Such value may be found by analysis of web code served by a web server system and provided to an intermediate security system that is tasked with recoding portions of the served code where the recoding will not affect the functionality of the code when it is executed on client devices.
- the intermediate security system identifies a relatively complex expression that will resolve to the value.
- the expression is shown here in the form of a pseudo-equation.
- operations are shown as a box surrounding a dot, to represent that any appropriate operation may be used.
- Parentheses are used to indicate grouping of operations, and the ability to have the relative groups combined with each other out of the order they are shown in the equation.
- the three main groups are each converted into code snippets to represent the relevant sub-expressions, and then the order in which those sub-expressions are evaluated is changed—where the second grouping from the formula is evaluated first in the code, then the first, and then the third. Additional code may be generated to evaluate the results of the three groupings together with each other.
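- A small sketch of this reordering follows. The concrete value 42, the operations chosen for each grouping, and the names `g1`, `g2`, `g3` are illustrative assumptions; the figure itself leaves the operations unspecified.

```javascript
// Hypothetical instance: the initial value 42 expressed as (g1 + g2) * g3,
// with the groupings emitted in the order second, first, third.
var g2 = 9 - 4;             // second grouping, evaluated first  (5)
var g1 = 32 / 16;           // first grouping, evaluated second  (2)
var g3 = 2 * 3;             // third grouping, evaluated last    (6)
var value = (g1 + g2) * g3; // additional code combining the groupings
```

- Because the three groupings do not depend on one another, they can be emitted in any order as long as the combining statement comes after all of them.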
- the code generated at 106 may then be inserted into the code received from the web server system and may be served to a client device that requested the code.
- that code is executed at the client device, such as using a web browser, and such execution generates the initial value 108 or a value that is equivalent to the initial value.
- the value “T” may be resolved into very different lines of code and expressions.
- the process shown here is able to replace original code with different code that serves as a proxy for the original code, and that reaches the same result as the original code when it is executed by the standard environment (e.g., standard JavaScript run-time) on a client device.
- FIG. 1A depicts an overview of a system 100 for encoding web content served from a web server 122 to a polymorphic encoding system 124 (or simply, encoding system) and to a web browser 126 .
- the system 100 represents a high-level depiction of the system in FIG. 1 .
- the polymorphic encoding system 124 receives web content from a web server 122 that is to be served to a web browser 126 at, for example, a client device. Prior to serving the web content to the web browser 126 , the polymorphic encoding system 124 identifies and encodes potentially sensitive data. Web content that is handled by the system 100 may include, for example, HTML, CSS, JavaScript, and other program code associated with the content or transmission of web resources such as a web page that may be presented at a client computer (or many different requesting client computers).
- FIG. 1B depicts various parts of the encoding system 124 of FIG. 1A .
- these components operate to transform incoming computer code so as to convert values or expressions into multiple additional expressions that resolve to the original values or expressions when they are executed as part of the code.
- a sensitive data identifier 110 parses code to identify sensitive data or potentially sensitive data, including data that can be recoded without affecting the functionality of the code when it is executed.
- data identifier 110 may broadly identify data that is to be replaced, regardless of whether the data is identified as sensitive in nature.
- program code P may comprise statements S 1 , S 2 , S 3 , and S 4 .
- the sensitive data identifier 110 identifies statement S 1 as potentially sensitive data.
- Various methods may be used for identifying potentially sensitive data. For example, data associated with a form to be filled out or with particular fields or fieldnames in a form may be identified. Also, an operator of a security system may study the code served by a particular organization and may flag particular fields or other elements that are frequently served by the organization and are of a sensitive nature. The sensitive data identifier 110 may then use a list of fields or other information generated by such an analysis to locate sensitive fields in other pages of web code to be served by the same organization.
- a replacement code generator 112 generates code that replaces such potentially sensitive data.
- The generated code, when executed, generates the same output as the originally identified potentially sensitive code.
- replacement code generator 112 generates four statements E 1 , E 2 , E 3 , and E 4 that, when executed, produce the same output as statement S 1 .
- Interleaver 114 takes the replacement code statements E 1 , E 2 , E 3 , and E 4 , and interleaves the replacement code statements into other programmatic statements that are already part of program P, or statements that have been generated as replacement code for other statements in the code.
- the interleaving process may be random (though avoiding any placement that would break the code) and may result in a different ordering of statements in response to two different requests.
- The resulting program with the interleaved statements, when executed, produces the same functional output as program P.
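- The interleaving step might be sketched as a random merge that keeps each list's internal order. The function `interleave` and the sample statements are hypothetical, and the sketch assumes the lists do not depend on each other, which is a simplification of "avoiding any placement that would break the code".

```javascript
// Hypothetical sketch: randomly merge several statement lists while
// preserving the relative order of the statements within each list.
function interleave(lists, rng = Math.random) {
  const queues = lists.map((l) => l.slice()).filter((q) => q.length > 0);
  const out = [];
  while (queues.length > 0) {
    const i = Math.floor(rng() * queues.length); // pick a list at random
    out.push(queues[i].shift());                 // take its next statement
    if (queues[i].length === 0) queues.splice(i, 1);
  }
  return out;
}

// Replacement statements for one original statement, plus untouched statements.
const replacementForS1 = ["var e1 = 3;", "var e2 = 2;", "var s1 = e1 - e2;"];
const restOfProgram = ["var s2 = 10;", "var s3 = s2 * 2;"];
const mixed = interleave([replacementForS1, restOfProgram]);
```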
- the data transferred from the encoding system 124 to the web browser 126 may be, for example, in the form of obfuscated JavaScript code with the sensitive data hidden within the code. Specific example methods for encoding the sensitive data are described below with respect to FIGS. 3A-3G .
- the encoding system 124 identifies and extracts the sensitive data and then modifies the code. The modified code is then incorporated in the original web content, replacing the sensitive data.
- FIG. 2 depicts an example of how program code P 202 may be modified, or encoded, into program code P′ 206 .
- the illustrated process involves identifying a number of operations or statements that may be joined together and transformed into code that, when executed under a standard programming environment (e.g., a standard run-time implementation), will produce an original starting value or expression.
- Program P 202 represents any appropriate web content, such as HTML, CSS, JavaScript, and other program code.
- Program P 202 comprises a set of n statements, {S1, S2, S3, ..., Sn}.
- the statement Si in the set of statements may be potentially sensitive data or content that is confirmed to be sensitive in nature.
- the statement Si in the set of statements may include statements that are identified as needing to be replaced.
- Each statement, Si may be a line of code or expression in the program.
- Each of the statements, Si, is rewritten as a set of statements {Si1, Si2, Si3, Si4, ...} that, collectively, is executed as the equivalent of the individual statement Si, as described in further detail below with respect to FIGS. 3A-3F.
- A set of equivalent statements E 204 for Program P 202 is generated. That is, the set of equivalent statements E 204 comprises n sets of equivalent statements, one for each statement Si in Program P 202.
- For example, the statement S1 is replaced with the set of equivalent statements {S11, S12, S13, S14, ...}.
- the number of statements in each set of equivalent statements need not be the same.
- the various statements in the set of equivalent statements E 204 are interleaved, as described below with respect to FIG. 3G .
- FIGS. 3A-3F show examples of equivalent statement replacement.
- the figures show manners in which a single line of code for expressing a variable-assigning and/or mathematical relationship can be expressed instead by multiple lines of code that can be executed in a particular order to reach the initial result.
- although numeric functions are shown in the examples here, other data may be similarly treated.
- an alphanumeric string may be transformed through multiple operations, so that the starting point is a string different than what the web server system provided, but that ends up in generating the same string when the code is executed by a web browser.
- a constant number in Javascript can be replaced by an equivalent Javascript expression.
- the number 1 can be written as (3 - 2) or as (0 + 1), and the number 24 can be written as the expression (4 * 6) or (30 - 6).
- the system can first use a random generator to generate a random number, for example, x, and then replace the number y with (x - (x - y)), which appears in the code to be different from y but is the functional equivalent of y when executed.
- FIG. 3A shows a sum (or subtraction) operation, but any appropriate type of JavaScript operation (e.g., sum, multiply, divide) or function that returns a number (e.g., String.length( ), Array.length( )) may be used.
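- The numeric replacement described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper names `encodeInteger` and `evaluate` are hypothetical, the random range is arbitrary, and only integers are handled so that the arithmetic stays exact.

```javascript
// Hypothetical helper (not from the source): rewrites an integer constant y
// as the expression text "(x - (x - y))", with the inner difference
// precomputed so the emitted code shows two unrelated-looking numbers.
function encodeInteger(y, randomInt = () => Math.floor(Math.random() * 1000000)) {
  const x = randomInt();
  const inner = x - y; // exact for integers within the safe-integer range
  return `(${x} - (${inner}))`;
}

// Evaluates an emitted expression, standing in for a browser running served code.
function evaluate(expression) {
  return new Function(`return ${expression};`)();
}

const encodedExample = encodeInteger(24);
console.log(encodedExample, '=>', evaluate(encodedExample));
```

- Because x is drawn fresh on every call, two servings of the same constant will usually produce different expression text that evaluates to the same value.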
- Other types of constants such as Boolean, string, array, and object, can be replaced with equivalent statements using a similar approach.
- FIG. 3B shows an example of an equivalent statement replacement of a Boolean operation, as one such example.
- the set of equivalent statements comprises a set of three statements, which, when executed, equivalently result in the variable y being set as “true.”
- the first two statements assign the values 3 and 2 to variables a and b, respectively. In doing so, the value of a is assigned to be greater than the value of b.
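- A Boolean replacement of this kind might be generated as in the sketch below. It assumes the third statement is a comparison such as `y = a > b`, which the text implies but does not spell out; `encodeBoolean` and `runAndReadY` are hypothetical names.

```javascript
// Hypothetical generator: emits three statements that set `y` to the given
// Boolean by comparing two randomly chosen integers, mirroring the
// a = 3, b = 2 pattern described above.
function encodeBoolean(value, randomInt = () => Math.floor(Math.random() * 100)) {
  const small = randomInt();
  const large = small + 1 + randomInt(); // strictly greater than `small`
  const [a, b] = value ? [large, small] : [small, large];
  return [`var a = ${a};`, `var b = ${b};`, 'var y = a > b;'];
}

// Runs the emitted statements and returns the resulting value of y.
function runAndReadY(statements) {
  return new Function(`${statements.join('\n')}\nreturn y;`)();
}

console.log(encodeBoolean(true).join(' '), '=>', runAndReadY(encodeBoolean(true)));
```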
- FIG. 3C shows an example of an equivalent statement replacement of a string constant.
- the variable y is set as “abc”.
- a string or the characters in a string may be assigned numeric values, and the operations performed on the numeric examples in the figures above and below may be applied, and then the resulting numeric value or values may be converted back into alphanumeric characters for a string.
- the letter “a” may be assigned a value of 256 in a particular font definition, and the techniques discussed here may be used to break the value of 256 up into a plurality of expressions.
- the number 256 may be returned, and then may be rendered as a glyph for the character set, as an “a.”
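- The string technique can be sketched in the same spirit. The example below is an assumption-laden illustration: it uses Unicode code points rather than the font-specific glyph values mentioned above, and the helper names `hideCode` and `encodeString` are hypothetical.

```javascript
// Hypothetical helper: hides a character code n as the text "(x - (x - n))".
function hideCode(n, randomInt) {
  const x = randomInt();
  return `(${x} - (${x - n}))`;
}

// Rewrites a string constant as a String.fromCodePoint(...) call whose
// arguments are arithmetic expressions rather than readable characters.
function encodeString(text, randomInt = () => Math.floor(Math.random() * 1000)) {
  const args = Array.from(text, (ch) => hideCode(ch.codePointAt(0), randomInt));
  return `String.fromCodePoint(${args.join(', ')})`;
}

const encodedString = encodeString('abc');
console.log(encodedString, '=>', new Function(`return ${encodedString};`)());
```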
- Another method of equivalent statement replacement involves adding junk code or junk branches, and may be applied as an alternative or additionally to the other examples discussed here.
- the purpose of adding the junk code is to add “noise” to the code so that potential hackers or attackers cannot use the position of expressions in the code (e.g., line number or the nth statement) to locate a key function or variable.
- variables j and k may not be present anywhere else in the program code, so that while the new code causes j to be assigned the value of 4 and k to be assigned the value of 7, neither variable otherwise affects the operation or execution of the program code.
- junk branches can be generated to add a level of obfuscation to the code.
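- Junk code and a junk branch might be generated as sketched below. The variable names j and k follow the text; the function name `addJunk` and the guard condition are assumptions, the latter chosen because n * (n + 1) is always even for an integer n, so the branch never executes.

```javascript
// Hypothetical generator: surrounds a real statement with junk assignments
// and appends a junk branch that appears to modify `targetName`.
function addJunk(statement, targetName, randomInt = () => Math.floor(Math.random() * 1000)) {
  const n = randomInt();
  return [
    `var j = ${randomInt()};`, // junk: j is only read inside the dead branch
    statement,                 // the real statement, unchanged
    `var k = ${randomInt()};`, // junk: k is only read inside the dead branch
    // The product of consecutive integers is even, so this test is never true.
    `if ((${n} * (${n} + 1)) % 2 === 1) { ${targetName} = j + k; }`,
  ];
}

const padded = addJunk('var y = 24;', 'y');
console.log(padded.join('\n'));
console.log(new Function(`${padded.join('\n')}\nreturn y;`)());
```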
- FIG. 3E shows an example of permutations of statements for a scenario in which equivalent statements do not require a strict order.
- for N such statements for which order does not matter, there are N! possible permutations.
- Permutation of statements may be used on an array, list, string, or other collection data structure.
- FIGS. 3A-3F illustrate various methods of producing equivalent statements 204.
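- Permuting order-independent statements can be illustrated with a standard Fisher-Yates shuffle. The three assignments below are hypothetical stand-ins for statements whose order does not matter, and `shuffle` is an illustrative helper, not a function named in the source.

```javascript
// Fisher-Yates shuffle; returns a new array and leaves the input untouched.
function shuffle(items, random = Math.random) {
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Three independent assignments: any of the 3! = 6 orderings is equivalent.
const independentStatements = ['var a = 1;', 'var b = 2;', 'var c = 3;'];
const permutedStatements = shuffle(independentStatements);
const permutedTotal = new Function(`${permutedStatements.join('\n')}\nreturn a + b + c;`)();
console.log(permutedStatements, permutedTotal);
```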
- an encoded program P′ 206 can be generated by interleaving the equivalent statements E 204 with the rest of the code, as shown, for example, in FIG. 3G (and potentially identifying which of the statements can be included in either order relative to other of the statements).
- a set of replacement statements 376 are generated for statement 372 and a set of replacement statements 378 are generated for statement 374 .
- FIG. 3G shows an example where the order of the statements within each of the sets of replacement statements 376, 378 matters (i.e., affects the outcome of the executed code), but the order of the statements between the two sets of replacement statements 376, 378 does not matter.
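- Interleaving of this kind, where each set keeps its internal order but the two sets are mixed freely, can be sketched as a random order-preserving merge. The statement contents assigned to sets 376 and 378 below are hypothetical, as is the function name `interleave`.

```javascript
// Randomly merges two statement lists while preserving the internal order of
// each list.
function interleave(first, second, random = Math.random) {
  const merged = [];
  let i = 0;
  let j = 0;
  while (i < first.length || j < second.length) {
    const remainingFirst = first.length - i;
    const remainingSecond = second.length - j;
    // Pick the next source with probability proportional to what it has left.
    const takeFirst = random() * (remainingFirst + remainingSecond) < remainingFirst;
    merged.push(takeFirst ? first[i++] : second[j++]);
  }
  return merged;
}

// Hypothetical contents for the two sets of replacement statements.
const replacements376 = ['var a = 3;', 'var b = a - 2;', 'var x = b;'];
const replacements378 = ['var c = 4;', 'var d = c * 6;', 'var z = d;'];
const interleaved = interleave(replacements376, replacements378);
console.log(interleaved.join('\n'));
console.log(new Function(`${interleaved.join('\n')}\nreturn [x, z];`)());
```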
- FIG. 4 is a flow diagram of a process for serving modified program code.
- the process involves identifying items in content to be served to a client computer that may potentially include sensitive data, transforming the data dynamically and randomly into a set of other data, and incorporating the set of other data into the content in a manner so as to hide the potentially sensitive data.
- a request for web content is received, such as from a client computer operated by an individual seeking to perform a banking transaction at a website for the individual's bank.
- the request may be in the form of an HTTP request and may be received by a load balancer operated by, or for, the bank.
- the load balancer may recognize the form of the request and understand that it is to be handled by a security system that the bank has installed to operate along with its web server system.
- the load balancer may thus provide the request to the security system, which may forward it to the web server system after analyzing the request (e.g., to open a tracking session based on the request), or may provide the request to the web server system and also provide information about the request to the security system in parallel.
- a response to the request is generated by the web server system.
- the user may have requested to perform a funds transfer between accounts at the bank, where the funds are owned by the individual, and the response by the web server system may include HTML for a webpage on which the user can specify parameters for the transaction, along with JavaScript code and CSS code for carrying out such transactions at a web browser operated by the individual.
- the web server system sends the response to the request to an encoding system.
- the response may comprise the web content requested by the client computer. Included in the response may be potentially sensitive data, such as, for example, account numbers, routing numbers, or other data relating to a banking transaction.
- the encoding system receives the web content from the web server system and identifies potentially sensitive data in the web content.
- the encoding system generates code to replace the sensitive data.
- the sensitive data may be rewritten as a set of replacement statements which, when executed, display the same content as the sensitive data, resulting in no difference in appearance to a user requesting the web content.
- Various methods for rewriting or replacing the sensitive data are possible, including the methods described above with respect to FIGS. 3A-3F .
- the replacement of sensitive data may include replacing a single statement or expression in the web content, or it may include replacing numerous statements. At a minimum, however, a single statement, or expression, is replaced with a set of equivalent statements.
- the set of equivalent statements may comprise one or more statements, which, when collectively executed, output the same result as the initial statement comprising sensitive data.
- the encoding system may identify a single statement assigning a constant value to contain sensitive data.
- the encoding system may randomly generate a set of equivalent statements, which, collectively, make the same assignment, as illustrated, for example, in FIG. 3A .
- the encoding system may identify a single statement containing sensitive data.
- the encoding system may add one or more lines of junk code or junk branches. The purpose of the junk code is to add a layer of randomness to the code to prevent potential hackers from using the position (e.g., line number) of code to identify potentially sensitive data. When executed, the junk code has no visible effect on the displayed web content. Similarly, the encoding system may generate junk branches that appear to supplement the original statement or expression in the code.
- the junk branches may comprise conditional statements or expressions that will never execute.
- An example of generating a junk branch is discussed above with respect to FIG. 3D .
- Adding junk branches adds an additional level of obfuscation to the code that hampers a potential attacker's ability to target sensitive data.
- the encoding system may employ recursive coding, generating multiple “layers” of replacement code.
- An example of recursive coding is shown, for example, in FIG. 3F .
- an assignment statement is replaced with three separate statements.
- a junk branch is added to one of the three replacement statements. While the example shown in FIG. 3F shows only two “layers” of replacement code, any number of “layers” of replacement code may be generated.
- the method moves to box 412 where the various replacement statements are interleaved in the code of the web content.
- An example of the interleaving process is described above with respect to FIG. 3G .
- Each of the two sets of replacement code 376 and 378 has a particular ordering of the statements. In some instances, the ordering of the individual statements affects the execution of the code, while in other instances, the ordering does not affect the outcome.
- the encoding system randomly and dynamically generates code to replace the sensitive data. That is, given the same input code (i.e., web content), the encoding system does not necessarily generate the same replacement code in response to two different requests for the web content. Furthermore, the approach for generating the replacement code may be different in response to two requests for the same web content. For example, in response to one request, the encoder system may replace a first statement with a set of three replacement statements that collectively result in the same result as the first statement, such as the example shown in FIG. 3A . In response to a second request for the same web content, the encoder system may replace the same first statement with a set of three different replacement statements that include a junk branch, such as the example shown in FIG. 3D .
- the process then serves the recoded web content at box 414 , in familiar manners.
- Such a process may be performed repeatedly each time a client computer requests content, with the recoded content being different each time the content is served through the encoding system, including when identical or nearly identical content is requested in separate transactions by two different users or by the same user.
- the code that is served by the encoding system may be supplemented with instrumentation code that runs on the computer browser and monitors interaction with the web page.
- the instrumentation code may look for particular method calls or other calls to be made, such as when the calls or actions relate to a field in a form that is deemed to be subject to malicious activity, such as a client ID number field, a transaction account number field, or a transaction amount field.
- When the instrumentation code observes such activity on the client device, it reports that activity along with metadata that helps to characterize the activity. The process receives such reports from the instrumentation code and processes them, such as by forwarding them to a central security system that may analyze them to determine whether such activity is benign or malicious.
- FIG. 5 shows a system 500 for serving polymorphic and instrumented code.
- polymorphic code is code that is changed in different manners for different servings of the code, in manners that do not affect the way in which the executed code is perceived by users. The goal is to create a moving target for malware that tries to determine how the code operates, but without changing the user experience.
- Instrumented code is code that is served, e.g., to a browser, with the main functional code and monitors how the functional code operates on a client device, and how other code may interact with the functional code and other activities on the client device.
- the system 500 may identify values or expressions in the code that can be replaced with multiple other expressions that, when executed on a client device, resolve to the initial value or expressions.
- the system 500 may be adapted to perform deflection and detection of malicious activity with respect to a web server system. Deflection may occur, for example, by the serving of polymorphic code, which interferes with the ability of malware to interact effectively with the code that is served. Detection may occur, for example, by adding instrumentation code (including injected code for a security service provider) that monitors activity of client devices that are served web code.
- the system 500 in this example is a system that is operated by or for a large number of different businesses that serve web pages and other content over the internet, such as banks and retailers that have on-line presences (e.g., on-line stores, or on-line account management tools).
- the main server systems operated by those organizations or their agents are designated as web servers 504 a - 504 n , and could include a broad array of web servers, content servers, database servers, financial servers, load balancers, and other necessary components (either as physical or virtual servers).
- security server systems 502 a to 502 n may cause code from the web server system to be supplemented and altered.
- code may be provided, either by the web server system itself as part of the originally-served code, or by another mechanism after the code is initially served, such as by the security server systems 502 a to 502 n , where the supplementing code causes client devices to which the code is served to transmit data that characterizes the client devices and the use of the client devices.
- other actions may be taken by the supplementing code, such as the code reporting actual malware activity or other anomalous activity at the client devices that can then be analyzed to determine whether the activity is malware activity.
- the set of security server systems 502 a to 502 n is shown connected between the web servers 504 a to 504 n and a network 510 such as the internet. Although both extend to n in number, the actual number of sub-systems could vary. For example, certain of the customers could install two separate security server systems to serve all of their web server systems (which could be one or more), such as for redundancy purposes.
- the particular security server systems 502 a - 502 n may be matched to particular ones of the web server systems 504 a - 504 n , or they may be at separate sites, and all of the web servers for various different customers may be provided with services by a single common set of security servers 502 a - 502 n (e.g., when all of the server systems are at a single co-location facility so that bandwidth issues are minimized).
- Each of the security server systems 502 a - 502 n may be arranged and programmed to carry out operations like those discussed above and below and other operations.
- a policy engine 520 in each such security server system may evaluate HTTP requests from client computers (e.g., desktop, laptop, tablet, and smartphone computers) based on header and network information, and can set and store session information related to a relevant policy.
- the policy engine may be programmed to classify requests and correlate them to particular actions to be taken to code returned by the web server systems before such code is served back to a client computer.
- the policy information may be provided to a decode, analysis, and re-encode module, which matches the content to be delivered, across multiple content types (e.g., HTML, JavaScript, and CSS), to actions to be taken on the content (e.g., using XPATH within a DOM), such as substitutions, addition of content, and other actions that may be provided as extensions to the system.
- the different types of content may be analyzed to determine naming that may extend across such different pieces of content (e.g., the name of a function or parameter), and such names may be changed in a way that differs each time the content is served, e.g., by replacing a named item with randomly-generated characters.
- Elements within the different types of content may also first be grouped as having a common effect on the operation of the code (e.g., if one element makes a call to another), and then may be re-encoded together in a common manner so that their interoperation with each other will be consistent even after the re-encoding.
- Both the analysis of content for determining which transformations to apply to the content, and the transformation of the content itself, may occur at the same time (after receiving a request for the content) or at different times.
- the analysis may be triggered, not by a request for the content, but by a separate determination that the content newly exists or has been changed. Such a determination may be via a “push” from the web server system reporting that it has implemented new or updated content.
- the determination may also be a “pull” from the security servers 502 a - 502 n , such as by the security servers 502 a - 502 n implementing a web crawler (not shown) to recursively search for new and changed content and to report such occurrences to the security servers 502 a - 502 n , and perhaps return the content itself and perhaps perform some processing on the content (e.g., indexing it or otherwise identifying common terms throughout the content, creating DOMs for it, etc.).
- the analysis to identify portions of the content that should be subjected to polymorphic modifications each time the content is served may then be performed according to the manner discussed above and below.
- a rules engine 522 may store analytical rules for performing such analysis and for re-encoding of the content.
- the rules engine 522 may be populated with rules developed through operator observation of particular content types, such as by operators of a system studying typical web pages that call JavaScript content and recognizing that a particular method is frequently used in a particular manner. Such observation may result in the rules engine 522 being programmed to identify the method and calls to the method so that they can all be grouped and re-encoded in a consistent and coordinated manner.
- the decode, analysis, and re-encode module 524 encodes content being passed to client computers from a web server according to relevant policies and rules.
- the module 524 also reverse encodes requests from the client computers to the relevant web server or servers.
- a web page may be served with a particular parameter, and may refer to JavaScript that references that same parameter.
- the decode, analysis, and re-encode module 524 may replace the name of that parameter, in each of the different types of content, with a randomly generated name, and each time the web page is served (or at least in varying sessions), the generated name may be different.
- When the name of the parameter is passed back to the web server, it may be decoded back to its original name so that this portion of the security process may occur seamlessly for the web server.
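- The coordinated renaming and its reversal can be sketched as follows. All names here (`randomName`, `renameEverywhere`, `decodeRequest`) are hypothetical, and the regex-based whole-word matching is a simplifying assumption that works only for names without regular-expression metacharacters; a real system would more plausibly operate on parsed content.

```javascript
// Generates a random lowercase identifier (illustrative only).
function randomName(length = 8, random = Math.random) {
  const letters = 'abcdefghijklmnopqrstuvwxyz';
  let name = '';
  for (let i = 0; i < length; i++) {
    name += letters[Math.floor(random() * letters.length)];
  }
  return name;
}

// Replaces every whole-word occurrence of `original` in each piece of content
// with one shared random alias, and returns the alias for later decoding.
function renameEverywhere(contents, original) {
  const alias = randomName();
  const pattern = new RegExp(`\\b${original}\\b`, 'g');
  const rewritten = {};
  for (const [type, text] of Object.entries(contents)) {
    rewritten[type] = text.replace(pattern, alias);
  }
  return { rewritten, alias };
}

// Restores the original name in a request body before it reaches the web server.
function decodeRequest(body, alias, original) {
  return body.split(alias).join(original);
}

const servedPage = renameEverywhere(
  { html: '<input name="account">', js: 'document.forms[0].account.value' },
  'account'
);
console.log(servedPage.rewritten);
console.log(decodeRequest(`${servedPage.alias}=12345`, servedPage.alias, 'account'));
```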
- a key for the function that encodes and decodes such strings can be maintained by the security server system 502 along with an identifier for the particular client computer so that the system 502 may know which key or function to apply, and may otherwise maintain a state for the client computer and its session.
- a stateless approach may also be employed, whereby the system 502 encrypts the state and stores it in a cookie that is saved at the relevant client computer. The client computer may then pass that cookie data back when it passes the information that needs to be decoded back to its original status. With the cookie data, the system 502 may use a private key to decrypt the state information and use that state information in real-time to decode the information from the client computer.
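- A stateless round trip of this kind might look like the sketch below, which runs under Node.js. It rests on several assumptions not found in the source: a symmetric AES-256-GCM key held only by the security server (the text says only that a private key decrypts the state), hex encoding for the cookie value, and the hypothetical function names `sealState` and `openState`.

```javascript
const nodeCrypto = require('crypto');

// Assumption: one symmetric key known only to the security server system.
const serverKey = nodeCrypto.randomBytes(32);

// Serializes and encrypts session state into a cookie-safe string
// of the form "iv.tag.ciphertext" (all hex).
function sealState(state, key = serverKey) {
  const iv = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([cipher.update(JSON.stringify(state), 'utf8'), cipher.final()]);
  return [iv, cipher.getAuthTag(), body].map((part) => part.toString('hex')).join('.');
}

// Decrypts and authenticates a cookie value produced by sealState.
function openState(cookieValue, key = serverKey) {
  const [iv, tag, body] = cookieValue.split('.').map((part) => Buffer.from(part, 'hex'));
  const decipher = nodeCrypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  const text = Buffer.concat([decipher.update(body), decipher.final()]).toString('utf8');
  return JSON.parse(text);
}

const sealedCookie = sealState({ alias: 'qwertyui', original: 'account' });
console.log(sealedCookie);
console.log(openState(sealedCookie));
```

- With this arrangement the server keeps no per-session table; everything needed to decode a returning request travels in the cookie.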
- Such a stateless implementation may create benefits such as less management overhead for the server system 502 (e.g., for tracking state, for storing state, and for performing clean-up of stored state information as sessions time out or otherwise end) and as a result, higher overall throughput.
- the decode, analysis, and re-encode module 524 and the security server system 502 may be configured to modify web code differently each time it is served in a manner that is generally imperceptible to a user who interacts with such web code.
- multiple different client computers may request a common web resource such as a web page or web application that a web server provides in response to the multiple requests in substantially the same manner.
- a common web page may be requested from a web server, and the web server may respond by serving the same or substantially identical HTML, CSS, JavaScript, images, and other web code or files to each of the clients in satisfaction of the requests.
- particular portions of requested web resources may be common among multiple requests, while other portions may be client or session specific.
- the decode, analysis, and re-encode module 524 may be adapted to apply different modifications to each instance of a common web resource, or common portion of a web resource, such that the web code that is ultimately delivered to the client computers in response to each request for the common web resource includes different modifications.
- the analysis can happen a single time for a plurality of servings of the code in different recoded instances. For example, the analysis may identify a particular function name and all of the locations it occurs throughout the relevant code, and may create a map to each such occurrence in the code. Subsequently, when the web content is called to be served, the map can be consulted and random strings may be inserted in a coordinated manner across the code. Generating a new name for the function each time and substituting that name into the code requires much less computing cost than a full re-analysis of the content. Also, when a page is to be served, it can be analyzed to determine which portions, if any, have changed since the last analysis, and subsequent analysis may be performed only on the portions of the code that have changed.
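- The split between one-time analysis and cheap per-request recoding can be sketched as below. The function names, the regex-based scan, and the alias format are illustrative assumptions; the point is only that the serving step reuses recorded offsets instead of scanning the code again.

```javascript
// Analysis step (run once): records the offset of every whole-word
// occurrence of `name` in the code.
function buildOccurrenceMap(code, name) {
  const pattern = new RegExp(`\\b${name}\\b`, 'g');
  const offsets = [];
  for (const match of code.matchAll(pattern)) {
    offsets.push(match.index);
  }
  return { name, offsets };
}

// Serving step (run per request): splices an alias in at the recorded
// offsets without scanning the code again.
function serveWithAlias(code, map, alias) {
  let output = '';
  let cursor = 0;
  for (const offset of map.offsets) {
    output += code.slice(cursor, offset) + alias;
    cursor = offset + map.name.length;
  }
  return output + code.slice(cursor);
}

// Produces a fresh identifier for each serving (leading letter keeps it valid).
function freshAlias() {
  return 'f' + Math.random().toString(36).slice(2, 10);
}

const originalCode = 'function transfer(amount) { return amount * 2; }\nvar result = transfer(21);';
const occurrenceMap = buildOccurrenceMap(originalCode, 'transfer');
const servedOnce = serveWithAlias(originalCode, occurrenceMap, freshAlias());
console.log(servedOnce);
console.log(new Function(`${servedOnce}\nreturn result;`)());
```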
- the security server system 502 can apply the modifications in a manner that does not substantially affect a way that the user interacts with the resource, regardless of the different transformations applied. For example, when two different client computers request a common web page, the security server system 502 applies different modifications to the web code corresponding to the web page in response to each request for the web page, but the modifications do not substantially affect a presentation of the web page between the two different client computers. The modifications can therefore be made largely transparent to users interacting with a common web resource so that the modifications do not cause a substantial difference in the way the resource is displayed or the way the user interacts with the resource on different client devices or in different sessions in which the resource is requested.
- An instrumentation module 526 is programmed to add instrumentation code to the content that is served from a web server.
- the instrumentation code is code that is programmed to monitor the operation of other code that is served. For example, the instrumentation code may be programmed to identify when certain methods are called, when those methods have been identified as likely to be called by malicious software. When such actions are observed to occur by the instrumentation code, the instrumentation code may be programmed to send a communication to the security server reporting on the type of action that occurred and other meta data that is helpful in characterizing the activity. Such information can be used to help determine whether the action was malicious or benign.
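- One way such monitoring could work is by wrapping a watched method so that each call is reported before the original runs. The sketch below is an assumption about mechanism, not the source's design: `instrumentMethod` is a hypothetical helper, the `report` callback stands in for whatever transport reaches the security server, and the form object is a stand-in for page code.

```javascript
// Hypothetical instrumentation helper: wraps a method so that every call is
// reported with some metadata before the original method runs.
function instrumentMethod(target, methodName, report) {
  const original = target[methodName];
  target[methodName] = function instrumented(...args) {
    report({ method: methodName, argumentCount: args.length, time: Date.now() });
    return original.apply(this, args);
  };
  // Returns a function that removes the wrapper again.
  return () => {
    target[methodName] = original;
  };
}

const observedCalls = [];
const transferForm = {
  amount: 0,
  setAmount(value) {
    this.amount = value;
    return this.amount;
  },
};
const restoreSetAmount = instrumentMethod(transferForm, 'setAmount', (r) => observedCalls.push(r));
transferForm.setAmount(250);
console.log(observedCalls.length, transferForm.amount);
```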
- the instrumentation code may also analyze the DOM on a client computer in predetermined manners that are likely to identify the presence of and operation of malicious software, and to report to the security servers 502 or a related system.
- the instrumentation code may be programmed to characterize a portion of the DOM when a user takes a particular action, such as clicking on a particular on-page button, so as to identify a change in the DOM before and after the click (where the click is expected to cause a particular change to the DOM if there is benign code operating with respect to the click, as opposed to malicious code operating with respect to the click).
- Data that characterizes the DOM may also be hashed, either at the client computer or the server system 502 , to produce a representation of the DOM (e.g., in the differences between part of the DOM before and after a defined action occurs) that is easy to compare against corresponding representations of DOMs from other client computers.
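- A compact before-and-after representation might be computed as in the following sketch. It makes several assumptions: the DOM is modeled as plain objects with `tag`, `attrs`, and `children` fields so the example runs outside a browser, and the hash is 32-bit FNV-1a applied to UTF-16 code units, chosen for brevity rather than taken from the source.

```javascript
// 32-bit FNV-1a hash over a string's UTF-16 code units, as 8 hex digits.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Serializes a DOM-like tree deterministically (attributes sorted by name).
function serializeNode(node) {
  const attrs = Object.keys(node.attrs || {})
    .sort()
    .map((key) => `${key}=${node.attrs[key]}`)
    .join(',');
  const children = (node.children || []).map(serializeNode).join('');
  return `<${node.tag} ${attrs}>${children}</${node.tag}>`;
}

// Fingerprint of the change between two snapshots of part of the DOM.
function domFingerprint(before, after) {
  return fnv1a(`${serializeNode(before)}|${serializeNode(after)}`);
}

const formBefore = { tag: 'form', children: [{ tag: 'input', attrs: { name: 'amount' } }] };
const formAfterClick = { tag: 'form', children: [{ tag: 'input', attrs: { name: 'amount', disabled: 'true' } }] };
console.log(domFingerprint(formBefore, formAfterClick));
```

- Clients whose pages change in the expected way produce the same fingerprint, so a differing fingerprint is a cheap signal that something else altered the page.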
- Other techniques may also be used by the instrumentation code to generate a compact representation of the DOM or other structure expected to be affected by malicious code in an identifiable manner.
- Uninfected client computers 512a-512n represent computers that do not have malicious code programmed to interfere with a particular site a user visits or to otherwise perform malicious activity.
- Infected client computers 514 a - 514 n represent computers that do have malware or malicious code ( 518 a - 518 n , respectively) programmed to interfere with a particular site a user visits or to otherwise perform malicious activity.
- the client computers 512a-512n, 514a-514n may also store the encrypted cookies discussed above and pass such cookies back through the network 510.
- the client computers 512a-512n, 514a-514n will, once they obtain the served content, implement DOMs for managing the displayed web pages, and instrumentation code may monitor the respective DOMs as discussed above. Illogical activity (e.g., software on the client device calling a method that does not exist in the downloaded and rendered content) can then be reported back to the server system.
- each web site operator may be provided with a single security console 507 that provides analytical tools for a single site or group of sites.
- the console 507 may include software for showing groups of abnormal activities, or reports that indicate the type of code served by the web site that generates the most abnormal activity.
- a security officer for a bank may determine that defensive actions are needed if most of the reported abnormal activity for its web site relates to content elements corresponding to money transfer operations, which would indicate that stale malicious code may be trying to access such elements surreptitiously.
- Console 507 may also be multiple different consoles used by different employees of an operator of the system 500 , and may be used for pre-analysis of web content before it is served, as part of determining how best to apply polymorphic transformations to the web code.
- an operator at console 507 may form or apply rules 522 that guide the transformation that is to be performed on the content when it is ultimately served.
- the rules may be written explicitly by the operator or may be provided by automatic analysis and approved by the operator.
- the operator may perform actions in a graphical user interface (e.g., by selecting particular elements from the code by highlighting them with a pointer, and then selecting an operation from a menu of operations) and rules may be written consistent with those actions.
- a central security console 508 may connect to a large number of web content providers, and may be run, for example, by an organization that provides the software for operating the security server systems 502a-502n.
- Such console 508 may access complex analytical and data analysis tools, such as tools that identify clustering of abnormal activities across thousands of client computers and sessions, so that an operator of the console 508 can focus on those clusters in order to diagnose them as malicious or benign, and then take steps to thwart any malicious activity.
- the console 508 may have access to software for analyzing telemetry data received from a very large number of client computers that execute instrumentation code provided by the system 500 .
- Such data may result from forms being re-written across a large number of web pages and web sites to include content that collects system information such as browser version, installed plug-ins, screen resolution, window size and position, operating system, network information, and the like.
- user interaction with served content may be characterized by such code, such as the speed with which a user interacts with a page, the path of a pointer over the page, and the like.
- Such collected telemetry data may be used by the console 508 to identify what is “natural” interaction with a particular page that is likely the result of legitimate human actions, and what is “unnatural” interaction that is likely the result of a bot interacting with the content.
- Statistical and machine learning methods may be used to identify patterns in such telemetry data, and to resolve bot candidates to particular client computers.
- client computers may then be handled in special manners by the system 500 , may be blocked from interaction, or may have their operators notified that their computer is potentially running malicious software (e.g., by sending an e-mail to an account holder of a computer so that the malicious software cannot intercept it easily).
- FIG. 6 is a schematic diagram of a general computing system 600 .
- the system 600 can be used for the operations described in association with any of the computer-implemented methods described previously, according to one implementation.
- the system 600 is intended to include various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers.
- the system 600 can also include mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices.
- the system can include portable storage media, such as, Universal Serial Bus (USB) flash drives.
- USB flash drives may store operating systems and other applications.
- the USB flash drives can include input/output components, such as a wireless transmitter or USB connector that may be inserted into a USB port of another computing device.
- the system 600 includes a processor 610 , a memory 620 , a storage device 630 , and an input/output device 640 .
- Each of the components 610, 620, 630, and 640 is interconnected using a system bus 650.
- the processor 610 is capable of processing instructions for execution within the system 600 .
- the processor may be designed using any of a number of architectures.
- the processor 610 may be a CISC (Complex Instruction Set Computers) processor, a RISC (Reduced Instruction Set Computer) processor, or a MISC (Minimal Instruction Set Computer) processor.
- the processor 610 is a single-threaded processor. In another implementation, the processor 610 is a multi-threaded processor.
- the processor 610 is capable of processing instructions stored in the memory 620 or on the storage device 630 to display graphical information for a user interface on the input/output device 640 .
- The memory 620 stores information within the system 600.
- In one implementation, the memory 620 is a computer-readable medium.
- In one implementation, the memory 620 is a volatile memory unit.
- In another implementation, the memory 620 is a non-volatile memory unit.
- The storage device 630 is capable of providing mass storage for the system 600.
- In one implementation, the storage device 630 is a computer-readable medium.
- The storage device 630 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device.
- The input/output device 640 provides input/output operations for the system 600.
- In one implementation, the input/output device 640 includes a keyboard and/or pointing device.
- In another implementation, the input/output device 640 includes a display unit for displaying graphical user interfaces.
- The features described can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them.
- The apparatus can be implemented in a computer program product tangibly embodied in an information carrier, e.g., in a machine-readable storage device for execution by a programmable processor; and method steps can be performed by a programmable processor executing a program of instructions to perform functions of the described implementations by operating on input data and generating output.
- The described features can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device.
- A computer program is a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result.
- A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- Suitable processors for the execution of a program of instructions include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors of any kind of computer.
- Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both.
- The essential elements of a computer are a processor for executing instructions and one or more memories for storing instructions and data.
- Generally, a computer will also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks.
- Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
- The processor and the memory can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
- The features can be implemented on a computer having a display device such as a CRT (cathode ray tube) or LCD (liquid crystal display) monitor for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer. Additionally, such activities can be implemented via touchscreen flat-panel displays and other appropriate mechanisms.
- The features can be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination of them.
- The components of the system can be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), peer-to-peer networks (having ad-hoc or static members), grid computing infrastructures, and the Internet.
- The computer system can include clients and servers.
- A client and server are generally remote from each other and typically interact through a network, such as the described one.
- The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- The subject matter may be embodied as methods, systems, devices, and/or as an article or computer program product.
- The article or computer program product may comprise one or more computer-readable media or computer-readable storage devices, which may be tangible and non-transitory, that include instructions that may be executable by one or more machines such as computer processors.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Technology Law (AREA)
- Information Transfer Between Computers (AREA)
Abstract
Description
- This application claims the benefit under 35 U.S.C. 120 as a Continuation of U.S. patent application Ser. No. 14/286,324, filed on May 23, 2014, the entire contents of which are hereby incorporated by reference for all purposes as if fully set forth herein. The applicant(s) hereby rescind any disclaimer of claim scope in the parent application(s) or the prosecution history thereof and advise the USPTO that the claims in this application may be broader than any claim in the parent application(s).
- This document relates to computer security and interference with malware.
- Research indicates that a large share of web traffic involves computer bots—many are malware. Bot activities include content scraping, reconnaissance, credential stuffing, creating fake accounts, comment spamming, and similar activities. Bots can impose an unnecessary load on any company trying to serve web content efficiently. More importantly, they can attempt to “learn” the operation of a web site so as to exploit it. As one example, malicious software (malware) may execute a “man in the browser” attack by intercepting communications that a user makes with a web site in a manner that makes the user believe that he or she is actually communicating with the web site. For example, malware may generate a display for a user who is visiting a banking site, where the display requests from the user information such as social security number, credit card number, other account numbers. An organization that operates the malware may then have such data sent to it, and may use the data to steal from the user, the web site operator, or both.
- Various approaches have been taken to identify and prevent such malicious activity. For example, programs have been developed for operation on client computers or at the servers of the organizations that own and operate the client computer to detect improper activity.
- This document describes systems and techniques by which web code (e.g., HTML, CSS, and JavaScript) that a server system provides to client devices is modified before it is served over the internet, so as to make more difficult the exploitation of the code and the operator of the server system by clients that receive the code (including clients that are infected without their human users' knowledge). The modifications can be made to encode sensitive data, and may differ for different instances in which a web page and related content are served, whether to the same client computer or to different client computers. For example, a single expression or value in the code may be re-written as multiple expressions that, when executed, produce the initial value or expression. Where different code is served in response to each request, the expressions into which the initial value are resolved may also differ each time. The output of the code, when executed on the client computer, however, is the same for all such different versions of the served code so that a user at a client computer does not perceive a difference in the displayed web page. Specifically, two different users (or a single user in two different web browsing sessions) may be served slightly different code in response to the same requests, where the difference may be in implicit parts of the code that are not displayed so that the differences are not noticeable to the user or users.
- The manner in which an initial value or expression is rewritten into multiple expressions capable of being executed on a client computer may take a variety of forms. For example, different expressions, different numbers of expressions, and different ordering of the execution of the expressions may all be varied to interfere with malware. Also, these different parameters may be varied so as to be different from one serving of the code to the next. Such variation, which may be termed “polymorphism” of the code, may help create a moving target against which malware needs to apply itself. In one example, changing the code that is served to client devices in an essentially random manner (i.e., a manner that effectively prevents malware that has analyzed serving n from inferring something useful about serving n+x) each time the code is served can deter malicious code executing on the client computers (e.g., a Man in the Browser bot) from interacting with the served code in a predictable way so as to trick a user of the client computer into providing confidential financial information and the like. Moreover, external programs generally cannot drive web application functionality directly, and so preventing predictable interaction with served code can be an effective mechanism for preventing malicious computer activity.
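The variation described above can be illustrated with a minimal sketch, assuming the value being hidden is an integer. The helper names `splitInteger` and `evaluate` are hypothetical and do not come from this document; the sketch only shows that the number of expressions and their operands can differ on each serving while the executed result stays the same.

```javascript
// Hypothetical helper (not from the source): rewrite an integer as a random
// chain of additions and subtractions that ends on the original value.
function splitInteger(value, parts, rng = Math.random) {
  const lines = [];
  let running = Math.floor(rng() * 1000);
  lines.push(`var t0 = ${running};`);
  for (let i = 1; i < parts; i++) {
    const delta = Math.floor(rng() * 1000);
    const op = rng() < 0.5 ? "+" : "-";
    running = op === "+" ? running + delta : running - delta;
    lines.push(`var t${i} = t${i - 1} ${op} ${delta};`);
  }
  // The final correction step lands exactly on the original value.
  lines.push(`var result = t${parts - 1} + (${value - running});`);
  return lines.join("\n");
}

// Execute a generated snippet the way a client run-time would.
function evaluate(snippet) {
  return new Function(`${snippet}\nreturn result;`)();
}

// Two servings of the same value: different expression counts and operands.
const servingA = splitInteger(2, 3);
const servingB = splitInteger(2, 5);
console.log(evaluate(servingA), evaluate(servingB));
```

Both servings evaluate to the same value even though their text differs, which is the property the polymorphic approach relies on.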
- As described here, the techniques transform values or expressions, such as a cleartext string, a JavaScript object, or a JavaScript code snippet into another JavaScript snippet that is the equivalent to the input after it is executed (i.e., it produces an identical displayed output). The encoding is dynamic and random, which means that the encoding generates different output code each time given the same input (though the outputs may repeat periodically as long as that repetition is not frequent enough to allow malware to predict the output or readily obtain the repeated output). Because the encoded output code is still presented as cleartext, it may not be able to prevent a human from ascertaining sensitive data, but it may make it very difficult for a malicious party to write a computer program to extract the sensitive data automatically.
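As one hedged illustration of transforming a cleartext string into an equivalent JavaScript snippet, the sketch below offsets each character code by a per-serving key and emits code that reverses the offset at run time. The function names `encodeString` and `run`, and the offset scheme itself, are assumptions chosen for brevity, not the encoding used by the described system.

```javascript
// Hypothetical encoder (an assumption for illustration): turn a cleartext
// string into a JavaScript snippet that rebuilds it from offset code points.
function encodeString(text, rng = Math.random) {
  const key = 1 + Math.floor(rng() * 200);
  const codes = Array.from(text, (ch) => ch.codePointAt(0) + key);
  return (
    `(function(){var k=${key};var c=[${codes.join(",")}];` +
    `return c.map(function(n){return String.fromCodePoint(n-k);}).join("");})()`
  );
}

// Evaluate a generated snippet as a standard run-time would.
function run(snippet) {
  return new Function(`return ${snippet};`)();
}

const first = encodeString("acct-4417");
const second = encodeString("acct-4417");
console.log(first === second); // usually false: the key is chosen per call
console.log(run(first) === run(second)); // true: both resolve to the input
```

The emitted snippet is still cleartext, so a person can read it; the point is only that a fixed pattern match on the original string no longer works.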
- Likewise, other forms of computer attacks can also be prevented or deterred by the web code transformations described in this document. Some of these attacks include: (a) denial of service attacks, and particularly advanced application denial of service attacks, in which a malicious party targets a particular functionality of a website (e.g., a widget or other web application) and floods the server with requests for that functionality until the server can no longer respond to requests from legitimate users; (b) rating manipulation schemes in which fraudulent parties use automated scripts to generate a large number of positive or negative reviews of some entity such as a marketed product or business in order to artificially skew the average rating for the entity up or down; (c) fake account creation in which malicious parties use automated scripts to establish and use fake accounts on one or more web services to engage in attacks ranging from content spam, e-mail spam, identity theft, phishing, ratings manipulation, fraudulent reviews, and countless others; (d) fraudulent reservation of rival goods, by which a malicious party exploits flaws in a merchant's website to engage in a form of online scalping by purchasing all or a substantial amount of the merchant's inventory and quickly turning around to sell the inventory at a significant markup; (e) ballot stuffing, in which automated bots are used to register a large number of fraudulent poll responses; (f) website scraping, in which both malicious parties and others (e.g., commercial competitors), use automated programs to obtain and collect data such as user reviews, articles, or technical information published by a website, and where the scraped data is used for commercial purposes that may threaten to undercut the origin website's investment in the scraped content; and (g) web vulnerability assessments, in which malicious parties scan any number of websites for security vulnerabilities by analyzing the web code and structure of each site.
- The systems, methods, and techniques for web code modifications described in this document can, in certain implementations, prevent or deter one or more of these types of attacks. For example, transforming sensitive data by replacing expressions with a set of equivalent expressions and then interleaving the expressions in the set of equivalent expressions can cause the effectiveness of bots and other malicious automated scripts to be substantially diminished.
- The modification of code that is described in more detail below may be carried out by a security system that may supplement a web server system, and may intercept requests from client computers to the web server system and intercept responses from web servers of the system when they serve content back to the client computers (including where pieces of the content are served by different server systems). The modification may be of static code (e.g., HTML) and of related executable code (e.g., JavaScript) in combination. For example, the names of certain elements on a web page defined via HTML may be changed, as may references to items external to the HTML (e.g., CSS and JavaScript code). An expression may be rewritten as an equivalent expression or multiple expressions. For example, the expression “var y=2” may be rewritten as the following set of expressions: “var a=10”; “var b=8”; and “y=a−b”. As shown in this example, the combination of the three expressions in the set of expressions produces the same result as the original expression, that is, an assignment of the value 2 to the variable y. Such rewriting, or transforming, of code may occur by first identifying data present in code that is to be served to the client computer (e.g., HTML, CSS, and JavaScript) and grouping such occurrences of sensitive data for further processing (e.g., by generating flags that point to each such element or copying a portion of each such element). The identified data may be identified as sensitive or potentially sensitive or simply data that should be rewritten before being served. Processing of the data may occur by modifying each element throughout different formats of code, such as changing an expression in the manner above each time that name occurs in a parameter, method call, DOM operation, or elsewhere. Next, further processing may occur that comprises interleaving the set of elements throughout the new code. Such a process may be repeated each time a client computer requests code, and the modifications may be different for each serving of the same code.
- In certain instances, the analysis to identify values or expressions that can be rewritten without affecting the operation of the code may be performed once, and a map to occurrences of such values or expressions in the code may be generated, and then used for each serving of the code to locate the occurrences, so that they may be altered throughout the code in a consistent manner that does not break the code. Such analyze-once, transform-many approaches may lessen the computational load for such a system and allow greater scaling of the system to larger web server systems with high volume requirements.
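The analyze-once, transform-many idea can be sketched as follows. This is a toy under stated assumptions: the hypothetical `analyze` function treats every standalone integer literal as rewritable (it would also match digits inside string literals, which a real analysis would have to exclude), and the hypothetical `transform` function substitutes an inline `(a - b)` difference rather than emitting separate statements as in the “var y=2” example.

```javascript
// Analysis step, run once per page: record where each rewritable integer
// literal occurs. (Hypothetical and simplified; a regex is not a parser.)
function analyze(code) {
  const occurrences = [];
  const literal = /\b\d+\b/g;
  let match;
  while ((match = literal.exec(code)) !== null) {
    occurrences.push({
      start: match.index,
      end: match.index + match[0].length,
      value: Number(match[0]),
    });
  }
  return occurrences;
}

// Transform step, run per serving: reuse the map and replace each literal
// with a freshly randomized difference that evaluates to the same number.
function transform(code, occurrences, rng = Math.random) {
  let out = "";
  let cursor = 0;
  for (const { start, end, value } of occurrences) {
    const b = Math.floor(rng() * 100);
    out += code.slice(cursor, start) + `(${value + b} - ${b})`;
    cursor = end;
  }
  return out + code.slice(cursor);
}

const source = "var y = 2; var z = y * 21;";
const occurrenceMap = analyze(source); // computed once
const serving1 = transform(source, occurrenceMap); // computed per request
const serving2 = transform(source, occurrenceMap);
console.log(serving1);
console.log(serving2);
```

Because the map stores positions in the unmodified source, each serving pays only for the substitution pass, not for re-analysis.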
- Such modification of the served code can help to prevent bots or other malicious code from exploiting or even detecting weaknesses in the web server system. For example, the names of functions or variables may be changed in various random manners each time a server system serves the code. As noted above, such constantly changing modifications may interfere with the ability of malicious parties to identify how the server system operates and web pages are structured, so that the malicious party cannot generate code to automatically exploit that structure in dishonest manners. Such techniques may create a moving target that can prevent malicious organizations from reverse-engineering the operation of a web site so as to build automated bots that can interact with the web site, and potentially carry out Man-in-the-Browser and other Man-in-the-Middle operations and attacks.
- The techniques discussed here may be carried out by a server subsystem that acts as an adjunct to a web server system that is commonly employed by a provider of web content. For example, as discussed in more detail below, an internet retailer may have an existing system by which it presents a web storefront at a web site (e.g., www.examplestore.com), interacts with customers to show them information about items available for purchase through the storefront, and processes order and payment information through that same storefront. The techniques discussed here may be carried out by the retailer adding a separate server subsystem (either physical or virtualized) that stands between the prior system and the internet. The new subsystem may act to receive web code from the web servers (or from a traffic management system that receives the code from the web servers), may translate that code in random manners before serving it to clients, may receive responses from clients and translate them in the opposite direction, and then provide that information to the web servers using the original names and other data. In addition, such a system may provide the retailer or a third party with whom the retailer contracts (e.g., a web security company that monitors data from many different clients and helps them identify suspect or malicious activity) with information that identifies suspicious transactions. For example, the security subsystem may keep a log of abnormal interactions, may refer particular interactions to a human administrator for later analysis or for real-time intervention, may cause a financial system to act as if a transaction occurred (so as to fool code operating on a client computer) but to stop such a transaction, or any number of other techniques that may be used to deal with attempted fraudulent transactions.
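The two-way translation described above, in which names are changed before serving and restored before a response reaches the web servers, might look like the following minimal sketch. The session structure, the `f_` alias prefix, and the function names `makeSession`, `encodeHtml`, and `decodeForm` are all assumptions made for illustration; the document does not specify how the subsystem stores its mappings.

```javascript
// Hypothetical per-serving session: maps original field names to random
// aliases and back again.
function makeSession(fieldNames, rng = Math.random) {
  const forward = new Map();
  const reverse = new Map();
  for (const name of fieldNames) {
    let alias;
    do {
      alias = "f_" + Math.floor(rng() * 0xffffffff).toString(16);
    } while (reverse.has(alias)); // avoid handing out the same alias twice
    forward.set(name, alias);
    reverse.set(alias, name);
  }
  return { forward, reverse };
}

// Outbound: rename fields in the HTML before it is served to the client.
function encodeHtml(html, session) {
  let out = html;
  for (const [name, alias] of session.forward) {
    out = out.split(`name="${name}"`).join(`name="${alias}"`);
  }
  return out;
}

// Inbound: restore the original names before forwarding to the web server.
function decodeForm(body, session) {
  const decoded = {};
  for (const [key, value] of Object.entries(body)) {
    decoded[session.reverse.get(key) ?? key] = value;
  }
  return decoded;
}

const session = makeSession(["account", "amount"]);
const served = encodeHtml('<input name="account"><input name="amount">', session);
const posted = { [session.forward.get("account")]: "12345", [session.forward.get("amount")]: "10" };
console.log(served, decodeForm(posted, session));
```

The web server never sees the aliases, so its own code does not need to change when the subsystem is added.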
- In one implementation, a computer-implemented method is disclosed that includes identifying a piece of data for serving from a server system to a client device that is remote from the server system, the piece of data being part of executable code requested from the server by the client device; creating a plurality of expressions that, when executed, provide a result that corresponds to the piece of data; and providing, to the client device and as part of the executable code, the plurality of expressions along with code for executing the plurality of expressions, so that when the plurality of expressions are executed on the client device, the identified piece of data is returned on the client device without a need to serve the identified piece of data to the client device. The method can include performing a permutation on the plurality of expressions so that the plurality of expressions are ordered in the executable code in an order different than they were created. The order of the expressions can be selected randomly as part of the permutation.
- In some aspects, the method can include creating one or more additional expressions whose executed results are not used by other code that is part of the executable code served to the client device; and providing to the client device the plurality of expressions with the one or more additional expressions. Also, the method can include identifying, in the piece of data, data that needs to be kept away from malware that may be in the client device, and wherein creating a plurality of expressions comprises creating one or more replacement statements that when executed, provide a result that corresponds to the potentially sensitive data. The replacement statements can comprise one or more expressions that do not execute on the client device when the executable code is executed.
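The additional expressions whose results are never used can be sketched as decoy statements mixed into the real ones. The helper name `addDecoys` and the `_d` variable prefix are hypothetical; the sketch assumes decoy names do not collide with names in the real code.

```javascript
// Hypothetical helper: insert decoy statements at random positions while
// keeping the relative order of the real statements intact.
function addDecoys(statements, count, rng = Math.random) {
  const out = statements.slice();
  for (let i = 0; i < count; i++) {
    const decoy = `var _d${i} = ${Math.floor(rng() * 1000)} ^ ${Math.floor(rng() * 1000)};`;
    const position = Math.floor(rng() * (out.length + 1));
    out.splice(position, 0, decoy);
  }
  return out;
}

const real = ["var a = 10;", "var b = 8;", "var y = a - b;"];
const padded = addDecoys(real, 4);

// The decoys execute but nothing reads their results, so y is unchanged.
const y = new Function(padded.join("\n") + "\nreturn y;")();
console.log(padded.join("\n"));
console.log(y);
```

Inserting at arbitrary positions is safe here because a decoy neither reads nor writes any variable used by the real statements.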
- In certain aspects, the method can further include identifying, in the piece of data, a first expression and a second expression to be replaced, wherein creating a plurality of expressions comprises creating a first set of replacement expressions corresponding to the first expression and a second set of replacement expressions corresponding to the second expression; and interleaving the replacement expressions of the first set of replacement expressions with the replacement expressions of the second set of replacement expressions, wherein the plurality of expressions provided to the client device comprise the interleaved replacement expressions.
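Interleaving two sets of replacement expressions can be sketched as a random merge that preserves the internal order of each set, so that no statement runs before the statements it depends on. The function name `interleave` and the two example sets are assumptions for illustration.

```javascript
// Hypothetical random merge: each step takes the next statement from one of
// the two sets, so each set keeps its own relative order.
function interleave(first, second, rng = Math.random) {
  const merged = [];
  let i = 0;
  let j = 0;
  while (i < first.length || j < second.length) {
    const takeFirst = j >= second.length || (i < first.length && rng() < 0.5);
    merged.push(takeFirst ? first[i++] : second[j++]);
  }
  return merged;
}

// Replacement set for "var y = 2" and a second set for "var z = 12".
const setForY = ["var a = 10;", "var b = 8;", "var y = a - b;"];
const setForZ = ["var c = 3;", "var d = 4;", "var z = c * d;"];

const mixed = interleave(setForY, setForZ);
const [yValue, zValue] = new Function(mixed.join("\n") + "\nreturn [y, z];")();
console.log(mixed.join("\n"));
console.log(yValue, zValue);
```

A merge of this kind yields a different ordering on most servings while still producing the same values for `y` and `z`.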
- In other aspects, creating a plurality of expressions comprises creating a first set of replacement expressions; identifying a first replacement expression in the first set of replacement expressions; creating a second set of replacement expressions that, when executed, provide a result that corresponds to the first replacement expression; and replacing the first replacement expression with the second set of replacement expressions. The piece of data to be served can comprise formats of code in HTML, CSS, and JavaScript, wherein each of the formats interoperates with the other formats.
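The nested replacement described above, in which one replacement expression is itself replaced by a further set, can be sketched as follows. The helper name `replaceExpression` and the specific numbers are hypothetical.

```javascript
// Hypothetical helper: swap one statement in a set for a whole new set of
// statements that produces the same variable.
function replaceExpression(set, target, replacement) {
  const index = set.indexOf(target);
  if (index === -1) return set.slice(); // nothing to replace
  return [...set.slice(0, index), ...replacement, ...set.slice(index + 1)];
}

// First set: stands in for "var y = 2".
const firstSet = ["var a = 10;", "var b = 8;", "var y = a - b;"];
// Second set: stands in for the first replacement expression "var a = 10".
const secondSet = ["var p = 4;", "var q = 6;", "var a = p + q;"];

const nested = replaceExpression(firstSet, "var a = 10;", secondSet);
const nestedResult = new Function(nested.join("\n") + "\nreturn y;")();
console.log(nested.join("\n"));
console.log(nestedResult);
```

Applying the same step repeatedly deepens the chain of expressions without changing the final value.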
- In another implementation, a computer-implemented method is disclosed that comprises receiving, from a server system, web content comprising original code, wherein the web content is requested by a client device that is remote from the server system; identifying a piece of data in the code; creating a plurality of expressions that, when executed, provide a result that corresponds to the piece of data; generating modified code comprising the original code with the piece of data replaced with the plurality of expressions; and providing the modified code to the client device, wherein, when executed, the modified code provides a result that corresponds to the original code. In some aspects, generating modified code comprises interleaving the plurality of expressions into the original code with the identified piece of data removed. Also, in some aspects, the plurality of expressions is created in a first ordering, and the plurality of expressions is interleaved into the original code so that the plurality of expressions maintains the first ordering. In other aspects, the plurality of expressions are created in a first ordering, and the plurality of expressions are interleaved into the original code so that the plurality of expressions are in a second ordering that is different than the first ordering. In yet other aspects, the plurality of expressions includes one or more junk expressions that do not execute. In some aspects, the method further comprises selecting a first expression among the plurality of expressions; and creating a second plurality of expressions that, when executed, provide a result that corresponds to the selected first expression, wherein the generated modified code comprises the original code with the piece of data replaced with the plurality of expressions, with the selected first expression replaced with the second plurality of expressions.
- In another implementation, a computer system for recoding web content served to client computers is disclosed that comprises an interface for receiving information from a web server system configured to provide computer code in multiple different formats in response to requests from client computing devices; and a security intermediary that is arranged to (i) receive the computer code from the interface before the computer code is provided to the client computing devices, (ii) identify a piece of data in the computer code that is to be replaced; (iii) create a plurality of expressions that, when executed, provide a result that corresponds to the piece of data; and (iv) provide the plurality of expressions to the client computing devices with code for executing the plurality of expressions. In some aspects, the piece of data in the computer code that is to be replaced is identified as potentially sensitive data. In other aspects, the security intermediary is further arranged to perform a permutation of the plurality of expressions. In yet other aspects, the plurality of expressions comprise one or more expressions that do not execute. In yet another aspect, the security intermediary is further arranged to interleave the plurality of expressions with the code for executing the plurality of expressions.
- Other features and advantages will be apparent from the description and drawings, and from the claims.
- FIG. 1 is a conceptual diagram showing transformation of a value into multiple expressions that execute back to the value.
- FIG. 1A depicts a general overview of a system for requesting, modifying, and serving web content.
- FIG. 1B depicts a schematic diagram of an encoding system that modifies requested web content.
- FIG. 2 depicts an overview of a method for modifying program code.
- FIGS. 3A-3G depict various examples for modifying code for web content.
- FIG. 4 is a flow diagram of a process for serving modified, or encoded, web content.
- FIG. 5 shows a system for serving polymorphic code.
- FIG. 6 is a schematic diagram of a general computing system.
- Like reference numbers and designations in the various drawings indicate like elements.
- FIG. 1 is a conceptual diagram showing transformation of a value into multiple expressions that execute back to the value. In general, the diagram attempts to show at a high level how initial representations in code can be rewritten as multiple representations that together can be executed on a client device to return the initial representation. The multiple representations, however, can be difficult for automated malware to analyze because they cannot easily be matched to a template, can be scattered throughout the code in appropriate circumstances, and can be constantly changed, both in their values and in their order and locations in the code.
- The diagram depicts a process, flowing from left-to-right. The process starts with a value 102, which may take a variety of forms. The value may be a simple string or number in plaintext form. Such value may be found by analysis of web code served by a web server system and provided to an intermediate security system that is tasked with recoding portions of the served code where the recoding will not affect the functionality of the code when it is executed on client devices.
- At 104, the intermediate security system identifies a relatively complex expression that will resolve to the value. For clarity of explanation, the expression is shown here in the form of a pseudo-equation. In the equation, operations are shown as a box surrounding a dot, to represent that any appropriate operation may be used. Parentheses are used to indicate grouping of operations, and the ability to have the relative groups combined with each other out of the order they are shown in the equation. Thus, at 106, the three main groups are each converted into code snippets to represent the relevant sub-expressions, and then the order in which those sub-expressions are evaluated is changed, so that the second grouping from the formula is evaluated first in the code, then the first, and then the third. Additional code may be generated to evaluate the results of the three groupings together with each other.
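The reordering of groups can be made concrete with a small sketch. The value 42, the three groupings, and the choice of addition as the combining operation are assumptions picked for illustration; the figure itself leaves the operations unspecified.

```javascript
// Assumed decomposition (not from the figure): 42 = (6 * 5) + (20 - 11) + (9 / 3).
const groups = [
  "var g1 = 6 * 5;",   // first grouping in the formula
  "var g2 = 20 - 11;", // second grouping in the formula
  "var g3 = 9 / 3;",   // third grouping in the formula
];

// Evaluation order used in the served code: second, then first, then third.
const order = [1, 0, 2];

// Additional code combines the three partial results into the value T.
const snippet = order
  .map((index) => groups[index])
  .concat("var T = g1 + g2 + g3;")
  .join("\n");

const T = new Function(snippet + "\nreturn T;")();
console.log(snippet);
console.log(T);
```

Because the groups do not depend on one another, any ordering of the three snippets gives the same combined result.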
- The code generated at 106 may then be inserted into the code received from the web server system and may be served to a client device that requested the code. At 108, that code is executed at the client device, such as using a web browser, and such execution generates the initial value 108 or a value that is equivalent to the initial value. In subsequent servings of the code, the value “T” may be resolved into very different lines of code and expressions.
- In this manner, then, the process shown here is able to replace original code with different code that serves as a proxy for the original code, and that reaches the same result as the original code when it is executed by the standard environment (e.g., standard JavaScript run-time) on a client device.
-
FIG. 1A depicts an overview of asystem 100 for encoding web content served from aweb server 122 to a polymorphic encoding system 124 (or simply, encoding system) and to aweb browser 126. In general, thesystem 100 represents a high-level depiction of the system inFIG. 1 . - The
polymorphic encoding system 124 receives web content from aweb server 122 that is to be served to aweb browser 126 at, for example, a client device. Prior to serving the web content to theweb browser 126, thepolymorphic encoding system 124 identifies and encodes potentially sensitive data. Web content that is handled by thesystem 100 may include, for example, HTML, CSS, JavaScript, and other program code associated with the content or transmission of web resources such as a web page that may be presented at a client computer (or many different requesting client computers). -
FIG. 1B depicts various parts of theencoding system 124 ofFIG. 1A . In general, these components operate to transform incoming computer code so as to convert values or expressions into multiple additional expressions that resolve to the original values or expressions when they are executed as part of the code. - In the figure, a
sensitive data identifier 110 parses code to identify sensitive data or potentially sensitive data, including data that can be recoded without affecting the functionality of the code when it is executed. In some instances,data identifier 110 may broadly identify data that is to be replaced, regardless of whether the data is identified as sensitive in nature. In this example, program code P may comprise statements S1, S2, S3, and S4. In this example, thesensitive data identifier 110 identifies statement S1 as potentially sensitive data. - Various methods may be used for identifying potentially sensitive data. For example, data associated with a form to be filled out or with particular fields or fieldnames in a form may be identified. Also, an operator of a security system may study the code served by a particular organization and may flag particular fields or other elements that are frequently served by the organization and are of a sensitive nature. The
sensitive data identifier 110 may then use a list of fields or other information generated by such an analysis to locate sensitive fields in other pages of web code to be served by the same organization. - The sensitive data from the web server may typically be presented in cleartext form. A
replacement code generator 112 generates code that replaces such potentially sensitive data. The generated code, when executed, generates the same output as the originally-identified potentially sensitive code. In this example, replacement code generator 112 generates four statements E1, E2, E3, and E4 that, when executed, produce the same output as statement S1. Interleaver 114 takes the replacement code statements E1, E2, E3, and E4, and interleaves the replacement code statements into other programmatic statements that are already part of program P, or statements that have been generated as replacement code for other statements in the code. The interleaving process may be random (though avoiding any placement that would break the code) and may result in a different ordering of statements in response to two different requests. The resulting program with the interleaved statements, when executed, produces the same functional output as program P. - The data transferred from the
encoding system 124 to the web browser 126 may be, for example, in the form of obfuscated JavaScript code with the sensitive data hidden within the code. Specific example methods for encoding the sensitive data are described below with respect to FIGS. 3A-3G. When the sensitive data passes through the encoding system 124, the encoding system 124 identifies and extracts the sensitive data and then modifies the code. The modified code is then incorporated in the original web content, replacing the sensitive data. -
FIG. 2 depicts an example of how program code P 202 may be modified, or encoded, into program code P′ 206. In general, the illustrated process involves identifying a number of operations or statements that may be joined together and transformed into code that, when executed under a standard programming environment (e.g., a standard run-time implementation), will produce an original starting value or expression. - While the code of
program P 202 and program P′ 206 is not identical, the output of each of program P 202 and program P′ 206, when executed (e.g., via a web browser), is the same. Program P 202 represents any appropriate web content, such as HTML, CSS, JavaScript, and other program code. Program P 202 comprises a set of n statements, {S1, S2, S3, . . . , Sn}. The statement Si in the set of statements may be potentially sensitive data or content that is confirmed to be sensitive in nature. In some instances, the statement Si in the set of statements may include statements that are identified as needing to be replaced. Each statement, Si, may be a line of code or expression in the program. In Step 1, each of the statements, Si, is rewritten as a set of statements {Si1, Si2, Si3, Si4 . . . } that, collectively, is executed as the equivalent of the individual statement Si, as described in further detail below with respect to FIGS. 3A-3F. In this manner, after Step 1 is complete for each statement, Si, in Program P 202, a set of equivalent statements E 204 for Program P 202 is generated. That is, the set of equivalent statements E 204 comprises n sets of equivalent statements, one for each statement Si in Program P 202. For example, the statement S1 is replaced with the set of equivalent statements {S11, S12, S13, S14 . . . }. The number of statements in each set of equivalent statements need not be the same. Then, in Step 2, the various statements in the set of equivalent statements E 204 are interleaved, as described below with respect to FIG. 3G. -
FIGS. 3A-3F show examples of equivalent statement replacement. In general, the figures show manners in which a single line of code for expressing a variable-assigning and/or mathematical relationship can be expressed instead by multiple lines of code that can be executed in a particular order to reach the initial result. Although numeric functions are shown in the examples here, other data may be similarly treated. For example, an alphanumeric string may be transformed through multiple operations, so that the starting point is a string different from what the web server system provided, but the code ends up generating the same string when it is executed by a web browser. - Referring to
FIG. 3A, a constant number in JavaScript can be replaced by an equivalent JavaScript expression. There exist numerous ways to replace a constant number. For example, the number 1 can be written as (3−2) or as (0+1), and the number 24 can be written as the expression (4*6) or (30−6). To generate essentially random expressions for number y, the system can first use a random generator to generate a random number, for example, x, and then replace the number y with (x−(x−y)), which appears in the code to be different from y but is the functional equivalent of y when executed. FIG. 3A shows an example of this equivalent statement replacement where the statement “var y=2” is replaced with the set of statements {var a=10; var b=8; y=a−b}. While FIG. 3A shows a sum (or subtraction) operation, generally, any appropriate type of JavaScript operation (e.g., sum, multiply, divide) or expression that yields a number (e.g., the length property of a String or an Array) may be used. Other types of constants, such as Boolean, string, array, and object, can be replaced with equivalent statements using a similar approach. -
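The x−(x−y) replacement described for FIG. 3A can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the helper names `encodeConstant` and `runStatements`, the range of the random integer, and the use of the `Function` constructor to evaluate the result are all hypothetical choices made for the example.

```javascript
// Hypothetical helper (not from the source): rewrites "var <name> = <y>"
// as three statements that evaluate to the same integer.
function encodeConstant(name, y, rng = Math.random) {
  // Pick a random integer x; x - (x - y) is then always y.
  const x = Math.floor(rng() * 1000);
  return [
    `var a = ${x};`,
    `var b = ${x - y};`,
    `var ${name} = a - b;`,
  ];
}

// Evaluates the generated statements and returns the named variable.
function runStatements(statements, name) {
  return new Function(`${statements.join("\n")}\nreturn ${name};`)();
}

const encoded = encodeConstant("y", 2);
console.log(encoded.join("\n"));
console.log(runStatements(encoded, "y")); // 2
```

Because x is drawn anew on every call, two servings of the same constant will usually produce different statements that still resolve to the same value.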
FIG. 3B shows an example of an equivalent statement replacement of a Boolean operation, as one such example. Where the initial statement “var y=true” sets the value of variable y as “true,” the set of equivalent statements comprises a set of three statements, which, when executed, equivalently result in the variable y being set as “true.” In this example, the first two statements assign the values -
FIG. 3C shows an example of an equivalent statement replacement of a string constant. In the example, the statement “var y=‘abc’” is replaced with an equivalent set of statements comprising three statements. Collectively, when the equivalent set of statements is executed, the variable y is set as “abc”. As with the examples discussed with respect to FIGS. 3A and 3B, a nearly infinite number of combinations exist to replace the initial statement. In other implementations, a string or the characters in a string may be assigned numeric values, and the operations performed on the numeric examples in the figures above and below may be applied, and then the resulting numeric value or values may be converted back into alphanumeric characters for a string. For example, the letter “a” may be assigned a value of 256 in a particular font definition, and the techniques discussed here may be used to break the value of 256 up into a plurality of expressions. When those expressions are executed at the client device, the number 256 may be returned, and then may be rendered as a glyph for the character set, as an “a.” - Another method of equivalent statement replacement involves adding junk code or junk branches, and may be applied as an alternative or additionally to the other examples discussed here. The purpose of adding the junk code is to add “noise” to the code so that potential hackers or attackers cannot use the position of expressions in the code (e.g., line number or the nth statement) to locate a key function or variable. Junk code may be one or more statements that execute but have no effect on the execution of the rest of the program or the operation of the program. For instance, two simple assignment statements “j=4” and “k=3+j” may be a part of the set of statements that the replacement code generator creates. 
However, variables j and k may not be present anywhere else in the program code, so that while the new code causes j to be assigned the value of 4 and k to be assigned the value of 7, variables j and k are not used anywhere else in the code and do not otherwise affect the operation or execution of the program code.
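The junk-code idea above can be sketched as a small generator that scatters unused assignments among existing statements. This is only an illustration under assumptions: the function name `addJunk`, the `_j` naming prefix, and the substring check used to avoid name clashes are hypothetical and much cruder than a real parser-based check.

```javascript
// Hypothetical sketch: inserts `count` junk assignments at random positions.
function addJunk(statements, count, rng = Math.random) {
  const out = statements.slice();
  for (let i = 0; i < count; i++) {
    // Choose a variable name that appears nowhere in the code so far,
    // so the junk statement cannot affect the rest of the program.
    let name;
    do {
      name = "_j" + Math.floor(rng() * 1e6).toString(36);
    } while (out.some((s) => s.includes(name)));
    const junk = `var ${name} = ${Math.floor(rng() * 100)};`;
    // Any position is safe because nothing else reads the junk variable.
    out.splice(Math.floor(rng() * (out.length + 1)), 0, junk);
  }
  return out;
}

const original = ["var a = 10;", "var b = 8;", "var y = a - b;"];
const noisy = addJunk(original, 3);
console.log(noisy.join("\n"));
```

The original statements keep their relative order, but their line positions now vary from one serving to the next.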
- Alternatively, junk branches can be generated to add a level of obfuscation to the code.
FIG. 3D shows an example of adding a junk branch to the statement “var y=2”. Because the conditional statement “if (3<2)” is always false, junk branch “y=1” will never execute (i.e., the assignment of the value of 1 to y will never occur). The junk code or junk branches make it difficult for an attacker program to discern sensitive data from junk code or junk branches without more carefully parsing or analyzing the web content. -
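A junk branch like the one in FIG. 3D could be generated along the following lines. The sketch assumes a numeric value and uses a hypothetical helper named `addJunkBranch`; the choice of random bounds and decoy value is illustrative only.

```javascript
// Hypothetical sketch: emits an assignment followed by a branch that can
// never run, in the style of "if (3<2) y=1".
function addJunkBranch(name, value, rng = Math.random) {
  // lo < hi always holds, so the test (hi < lo) is always false.
  const lo = Math.floor(rng() * 100);
  const hi = lo + 1 + Math.floor(rng() * 100);
  // Decoy value that differs from the real one; it is never assigned.
  const decoy = value + 1 + Math.floor(rng() * 10);
  return [
    `var ${name} = ${value};`,
    `if (${hi} < ${lo}) { ${name} = ${decoy}; }`,
  ];
}

const branched = addJunkBranch("y", 2);
console.log(branched.join("\n"));
console.log(new Function(branched.join("\n") + "\nreturn y;")()); // 2
```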
FIG. 3E shows an example of permutations of statements for a scenario in which equivalent statements do not require a strict order. For example, the statement “var y=[1, 2, 3]” creates an array with three elements. The collectively equivalent four statements “var y=[ ]”, “y[1]=2”, “y[2]=3”, “y[0]=1” create the same array y regardless of the order in which the latter three statements are executed. Where the order of the code does not matter, a larger number of potential random ways exist to rewrite the code. Specifically, for N such statements for which order does not matter, there are N! possible permutations. In the example shown in FIG. 3E, because there are three statements where the order is irrelevant, there are six possible ways to generate equivalent code in this manner. Permutation of statements may be used on an array, list, string, or other collection data structure. - The encoding system may further employ recursive encoding.
FIG. 3F illustrates an example of random recursive encoding where a constant is replaced with a random expression and a junk branch is then added to one replacement statement. Specifically, first, a single statement “var y=2” is replaced with three separate statements. Then, one of the three statements is replaced with other statements that add a junk branch, similar to the example shown in FIG. 3D. Additional recursive encoding may be employed. After applying random basic polymorphic encoding approaches a random number of times, a simple assignment statement, such as “var y=2”, can be transformed into hundreds of lines of code. -
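The growth from one statement to many can be sketched with a recursive generator. This is an assumption-laden illustration rather than the method of FIG. 3F itself: it always combines the x−(x−y) rewrite with a junk branch at every level, and the names `encodeRecursive` and `freshName` and the `_t` prefix are hypothetical.

```javascript
let counter = 0; // source of fresh, hypothetical temporary names

function freshName() {
  return "_t" + counter++;
}

// Returns statements that assign `value` to `name`, re-encoding the
// intermediate constants `depth` more times.
function encodeRecursive(name, value, depth, rng = Math.random) {
  if (depth === 0) return [`var ${name} = ${value};`];
  const x = Math.floor(rng() * 1000);
  const a = freshName();
  const b = freshName();
  return [
    ...encodeRecursive(a, x, depth - 1, rng),         // a = x
    ...encodeRecursive(b, x - value, depth - 1, rng), // b = x - value
    `var ${name} = ${a} - ${b};`,                     // name = x - (x - value)
    `if (${x} < ${x - 1}) { ${name} = ${value + 1}; }`, // junk branch, never true
  ];
}

const layered = encodeRecursive("y", 2, 3);
console.log(layered.length); // 22
console.log(new Function(layered.join("\n") + "\nreturn y;")()); // 2
```

Each extra level roughly doubles the statement count, which is how a single assignment can expand to hundreds of lines after a modest number of layers.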
FIGS. 3A-3F illustrated various methods of producing equivalent statements 204. After performing such methods, an encoded program P′ 206 can be generated by interleaving the equivalent statements E 204 with the rest of the code, as shown, for example, in FIG. 3G (and potentially identifying which of the statements can be included in either order relative to other of the statements). In FIG. 3G, two separate statements 372 and 374 are part of program code P 202. In a first step, a set of replacement statements 376 are generated for statement 372 and a set of replacement statements 378 are generated for statement 374. Then, in a second step, the individual expressions of the set of replacement statements 376 are interleaved with the individual expressions of the set of replacement statements 378 to form an encoded form of the original web content 380. FIG. 3G shows an example where the order of the statements in each of the sets of replacement statements replacement statements -
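The interleaving step can be sketched as a random merge of two statement lists that keeps each list's internal order, which is one conservative way to avoid breaking dependencies. The function name `interleave` and the two example sets are hypothetical; the sketch also assumes the two sets use distinct variable names.

```javascript
// Hypothetical sketch: randomly merges two statement lists while keeping
// the relative order of statements inside each list.
function interleave(setA, setB, rng = Math.random) {
  const out = [];
  let i = 0;
  let j = 0;
  while (i < setA.length || j < setB.length) {
    // Take from A when B is exhausted, or on a coin flip while A remains.
    const takeA = j >= setB.length || (i < setA.length && rng() < 0.5);
    out.push(takeA ? setA[i++] : setB[j++]);
  }
  return out;
}

// Illustrative replacement sets (assumed, not copied from FIG. 3G).
const setA = ["var a = 10;", "var b = 8;", "var x = a - b;"];
const setB = ["var c = 3;", "var d = 4;", "var z = c * d;"];

const merged = interleave(setA, setB);
console.log(merged.join("\n"));
```

Where statements within a set are known to be order-independent, they could additionally be shuffled before merging, which increases the number of distinct outputs.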
FIG. 4 is a flow diagram of a process for serving modified program code. In general, the process involves identifying items in content to be served to a client computer that may potentially include sensitive data, transforming the data dynamically and randomly into a set of other data, and incorporating the set of other data into the content in a manner so as to hide the potentially sensitive data. - The process begins at
box 402, where a request for web content is received, such as from a client computer operated by an individual seeking to perform a banking transaction at a website for the individual's bank. The request may be in the form of an HTTP request and may be received by a load balancer operated by, or for, the bank. The load balancer may recognize the form of the request and understand that it is to be handled by a security system that the bank has installed to operate along with its web server system. The load balancer may thus provide the request to the security system, which may forward it to the web server system after analyzing the request (e.g., to open a tracking session based on the request), or may provide the request to the web server system and also provide information about the request to the security system in parallel. - At
box 404, a response to the request is generated by the web server system. For example, the user may have requested to perform a funds transfer between accounts at the bank, where the funds are owned by the individual, and the response by the web server system may include HTML for a webpage on which the user can specify parameters for the transaction, along with JavaScript code and CSS code for carrying out such transactions at a web browser operated by the individual. - At
box 406, the web server system sends the response to the request to an encoding system. The response may comprise the web content requested by the client computer. Included in the response may be potentially sensitive data, such as, for example, account numbers, routing numbers, or other data relating to a banking transaction. At box 408, the encoding system receives the web content from the web server system and identifies potentially sensitive data in the web content. - At
box 410, the encoding system generates code to replace the sensitive data. The sensitive data may be written as a set of replacement statements, which, when executed, are displayed the same as the sensitive data, resulting in no difference in appearance to a user requesting the web content. Various methods for rewriting or replacing the sensitive data are possible, including the methods described above with respect to FIGS. 3A-3F. The replacement of sensitive data may include replacing a single statement or expression in the web content, or it may include replacing numerous statements. At a minimum, however, a single statement, or expression, is replaced with a set of equivalent statements. The set of equivalent statements may comprise one or more statements, which, when collectively executed, output the same result as the initial statement comprising sensitive data. - In some instances, the encoding system may identify a single statement assigning a constant value to contain sensitive data. In response, the encoding system may randomly generate a set of equivalent statements, which, collectively, make the same assignment, as illustrated, for example, in
FIG. 3A. In other instances, the encoding system may identify a single statement containing sensitive data. In response, the encoding system may add one or more lines of junk code or junk branches. The purpose of the junk code is to add a layer of randomness to the code to prevent potential hackers from using the position (e.g., line number) of code to identify potentially sensitive data. When executed, the junk code has no visible effect on the displayed web content. Similarly, the encoding system may generate junk branches that appear to supplement the original statement or expression in the code. In some instances, the junk branches may comprise conditional statements or expressions that will never execute. An example of generating a junk branch is discussed above with respect to FIG. 3D. In that example, the conditional statement “if (3<2)” will never be true, so the assignment statement “y=1” will not occur and instead, the value 2 will always be assigned to y. Adding junk branches adds an additional level of obfuscation to the code that hampers a potential attacker's ability to target sensitive data. - In some instances, the encoding system may employ recursive coding, generating multiple “layers” of replacement code. An example of recursive coding is shown, for example, in
FIG. 3F. In a first step, an assignment statement is replaced with three separate statements. In a second step, a junk branch is added to one of the three replacement statements. While the example shown in FIG. 3F shows only two “layers” of replacement code, any number of “layers” of replacement code may be generated. - After the encoding system generates replacement code, the method moves to
box 412 where the various replacement statements are interleaved in the code of the web content. An example of the interleaving process is described above with respect to FIG. 3G. In that example, the encoder system first generates two sets of replacement code replacement code replacement code set 376, statements “var a=10” and “var b=8” can be executed in any order with respect to one another but must be executed before the statement “x=a−b” is executed. Where the order of the statements is irrelevant, more variations of interleaved code are possible. - In some instances, the encoding system randomly and dynamically generates code to replace the sensitive data. That is, given the same input code (i.e., web content), the encoding system does not necessarily generate the same replacement code in response to two different requests for the web content. Furthermore, the approach for generating the replacement code may be different in response to two requests for the same web content. For example, in response to one request, the encoder system may replace a first statement with a set of three replacement statements that collectively result in the same result as the first statement, such as the example shown in
FIG. 3A. In response to a second request for the same web content, the encoder system may replace the same first statement with a set of three different replacement statements that include a junk branch, such as the example shown in FIG. 3D. In both examples, the first statement to be replaced (presumably containing potentially sensitive data) is “var y=2”, but the encoder system generates two different sets of replacement statements. In another example, the encoder may generate the same set of replacement statements for a first statement (e.g., “var y=2”) in response to both requests, such as the replacement statements shown in FIG. 3A, but it may then interleave the code in a different order so that the resulting web code produced in response to the two requests is not identical. - The process then serves the recoded web content at
box 414, in familiar manners. Such a process may be performed repeatedly each time a client computer requests content, with the recoded content being different each time the content is served through the encoding system, including when identical or nearly identical content is requested in separate transactions by two different users or by the same user. - In addition, the code that is served by the encoding system may be supplemented with instrumentation code that runs on the computer browser and monitors interaction with the web page. For example, the instrumentation code may look for particular method calls or other calls to be made, such as when the calls or actions relate to a field in a form that is deemed to be subject to malicious activity, such as a client ID number field, a transaction account number field, or a transaction amount field. When the instrumentation code observes such activity on the client device, it will report that activity along with metadata that helps to characterize the activity. The process receives such reports from the instrumentation code and processes them, such as by forwarding them to a central security system that may analyze them to determine whether such activity is benign or malicious.
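One way instrumentation code might observe method calls is by wrapping the method and emitting a report on each call. The sketch below is hypothetical throughout: `instrument`, the `form` object, and the report fields are invented for illustration, and the report is collected in a local array rather than sent to a security system.

```javascript
// Hypothetical sketch: wraps a method so that every call produces a report
// with some metadata, then forwards the call to the original method.
function instrument(target, methodName, report) {
  const original = target[methodName];
  target[methodName] = function (...args) {
    report({ method: methodName, argCount: args.length, time: Date.now() });
    return original.apply(this, args);
  };
}

// Stand-in for a sensitive form field handler (assumed for the example).
const form = {
  setAmount(value) {
    this.amount = value;
    return value;
  },
};

const reports = []; // a real deployment would transmit these instead
instrument(form, "setAmount", (r) => reports.push(r));
form.setAmount(250);
console.log(reports.length); // 1
```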
-
FIG. 5 shows a system 500 for serving polymorphic and instrumented code. Generally, polymorphic code is code that is changed in different manners for different servings of the code, in manners that do not affect the way in which the executed code is perceived by users. The goal is to create a moving target for malware that tries to determine how the code operates, but without changing the user experience. Instrumented code is code that is served, e.g., to a browser, with the main functional code and monitors how the functional code operates on a client device, and how other code may interact with the functional code and other activities on the client device. In certain implementations, the system 500 may identify values or expressions in the code that can be replaced with multiple other expressions that, when executed on a client device, resolve to the initial value or expressions. - The
system 500 may be adapted to perform deflection and detection of malicious activity with respect to a web server system. Deflection may occur, for example, by the serving of polymorphic code, which interferes with the ability of malware to interact effectively with the code that is served. Detection may occur, for example, by adding instrumentation code (including injected code for a security service provider) that monitors activity of client devices that are served web code. - The
system 500 in this example is a system that is operated by or for a large number of different businesses that serve web pages and other content over the internet, such as banks and retailers that have on-line presences (e.g., on-line stores, or on-line account management tools). The main server systems operated by those organizations or their agents are designated as web servers 504 a-504 n, and could include a broad array of web servers, content servers, database servers, financial servers, load balancers, and other necessary components (either as physical or virtual servers). - In this example,
security server systems 502 a to 502 n may cause code from the web server system to be supplemented and altered. In one example of the supplementation, code may be provided, either by the web server system itself as part of the originally-served code, or by another mechanism after the code is initially served, such as by the security server systems 502 a to 502 n, where the supplementing code causes client devices to which the code is served to transmit data that characterizes the client devices and the use of the client devices. As also described below, other actions may be taken by the supplementing code, such as the code reporting actual malware activity or other anomalous activity at the client devices that can then be analyzed to determine whether the activity is malware activity. - The set of
security server systems 502 a to 502 n is shown connected between the web servers 504 a to 504 n and a network 510 such as the internet. Although both extend to n in number, the actual number of sub-systems could vary. For example, certain of the customers could install two separate security server systems to serve all of their web server systems (which could be one or more), such as for redundancy purposes. The particular security server systems 502 a-502 n may be matched to particular ones of the web server systems 504 a-504 n, or they may be at separate sites, and all of the web servers for various different customers may be provided with services by a single common set of security servers 502 a-502 n (e.g., when all of the server systems are at a single co-location facility so that bandwidth issues are minimized). - Each of the security server systems 502 a-502 n may be arranged and programmed to carry out operations like those discussed above and below and other operations. For example, a
policy engine 520 in each such security server system may evaluate HTTP requests from client computers (e.g., desktop, laptop, tablet, and smartphone computers) based on header and network information, and can set and store session information related to a relevant policy. The policy engine may be programmed to classify requests and correlate them to particular actions to be taken to code returned by the web server systems before such code is served back to a client computer. When such code returns, the policy information may be provided to a decode, analysis, and re-encode module, which matches the content to be delivered, across multiple content types (e.g., HTML, JavaScript, and CSS), to actions to be taken on the content (e.g., using XPATH within a DOM), such as substitutions, addition of content, and other actions that may be provided as extensions to the system. For example, the different types of content may be analyzed to determine naming that may extend across such different pieces of content (e.g., the name of a function or parameter), and such names may be changed in a way that differs each time the content is served, e.g., by replacing a named item with randomly-generated characters. Elements within the different types of content may also first be grouped as having a common effect on the operation of the code (e.g., if one element makes a call to another), and then may be re-encoded together in a common manner so that their interoperation with each other will be consistent even after the re-encoding. - Both the analysis of content for determining which transformations to apply to the content, and the transformation of the content itself, may occur at the same time (after receiving a request for the content) or at different times. For example, the analysis may be triggered, not by a request for the content, but by a separate determination that the content newly exists or has been changed. 
Such a determination may be via a “push” from the web server system reporting that it has implemented new or updated content. The determination may also be a “pull” from the security servers 502 a-502 n, such as by the security servers 502 a-502 n implementing a web crawler (not shown) to recursively search for new and changed content and to report such occurrences to the security servers 502 a-502 n, and perhaps return the content itself and perhaps perform some processing on the content (e.g., indexing it or otherwise identifying common terms throughout the content, creating DOMs for it, etc.). The analysis to identify portions of the content that should be subjected to polymorphic modifications each time the content is served may then be performed according to the manner discussed above and below.
- A
rules engine 522 may store analytical rules for performing such analysis and for re-encoding of the content. The rules engine 522 may be populated with rules developed through operator observation of particular content types, such as by operators of a system studying typical web pages that call JavaScript content and recognizing that a particular method is frequently used in a particular manner. Such observation may result in the rules engine 522 being programmed to identify the method and calls to the method so that they can all be grouped and re-encoded in a consistent and coordinated manner. - The decode, analysis, and
re-encode module 524 encodes content being passed to client computers from a web server according to relevant policies and rules. The module 524 also reverse encodes requests from the client computers to the relevant web server or servers. For example, a web page may be served with a particular parameter, and may refer to JavaScript that references that same parameter. The decode, analysis, and re-encode module 524 may replace the name of that parameter, in each of the different types of content, with a randomly generated name, and each time the web page is served (or at least in varying sessions), the generated name may be different. When the name of the parameter is passed back to the web server, it may be decoded back to its original name so that this portion of the security process may occur seamlessly for the web server. - A key for the function that encodes and decodes such strings can be maintained by the security server system 502 along with an identifier for the particular client computer so that the system 502 may know which key or function to apply, and may otherwise maintain a state for the client computer and its session. A stateless approach may also be employed, whereby the system 502 encrypts the state and stores it in a cookie that is saved at the relevant client computer. The client computer may then pass that cookie data back when it passes the information that needs to be decoded back to its original status. With the cookie data, the system 502 may use a private key to decrypt the state information and use that state information in real-time to decode the information from the client computer. Such a stateless implementation may create benefits such as less management overhead for the server system 502 (e.g., for tracking state, for storing state, and for performing clean-up of stored state information as sessions time out or otherwise end) and as a result, higher overall throughput.
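The stateless cookie approach can be sketched with Node.js's built-in `crypto` module. This is an illustration under loud assumptions: it swaps in symmetric AES-256-GCM with a server-held secret key rather than whatever key scheme the system 502 actually uses, and the function names `sealState` and `openState`, the cookie layout, and the example name mapping are all hypothetical.

```javascript
const crypto = require("crypto");

// Assumed server-side secret; a real system would persist and protect it.
const key = crypto.randomBytes(32);

// Encrypts session state (e.g., a random-name to original-name map) into a
// cookie value laid out as: 12-byte IV | 16-byte auth tag | ciphertext.
function sealState(state) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([
    cipher.update(JSON.stringify(state), "utf8"),
    cipher.final(),
  ]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
}

// Decrypts a cookie value back into the state; throws if it was altered.
function openState(cookie) {
  const raw = Buffer.from(cookie, "base64");
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  const body = Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]);
  return JSON.parse(body.toString("utf8"));
}

const cookie = sealState({ x9f2: "account_number" });
const state = openState(cookie);
console.log(state.x9f2); // account_number
```

With the mapping carried by the client, the server can translate a returned parameter name such as the hypothetical `x9f2` back to its original name without storing per-session state.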
- The decode, analysis, and
re-encode module 524 and the security server system 502 may be configured to modify web code differently each time it is served in a manner that is generally imperceptible to a user who interacts with such web code. For example, multiple different client computers may request a common web resource such as a web page or web application that a web server provides in response to the multiple requests in substantially the same manner. Thus, a common web page may be requested from a web server, and the web server may respond by serving the same or substantially identical HTML, CSS, JavaScript, images, and other web code or files to each of the clients in satisfaction of the requests. In some instances, particular portions of requested web resources may be common among multiple requests, while other portions may be client or session specific. The decode, analysis, and re-encode module 524 may be adapted to apply different modifications to each instance of a common web resource, or common portion of a web resource, such that the web code that is ultimately delivered to the client computers in response to each request for the common web resource includes different modifications. - In certain implementations, the analysis can happen a single time for a plurality of servings of the code in different recoded instances. For example, the analysis may identify a particular function name and all of the locations it occurs throughout the relevant code, and may create a map to each such occurrence in the code. Subsequently, when the web content is called to be served, the map can be consulted and random strings may be inserted in a coordinated manner across the code. Generating a new name each time for the function name, and inserting that name into the code, will require much less computing cost than would full re-analysis of the content. 
Also, when a page is to be served, it can be analyzed to determine which portions, if any, have changed since the last analysis, and subsequent analysis may be performed only on the portions of the code that have changed.
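The analyze-once, rename-per-serving idea can be sketched as follows. The sketch is hypothetical and deliberately crude: `buildMap` and `serve` are invented names, a word-boundary regular expression stands in for real parsing (it would also match inside strings and comments), the name is assumed to contain no regular-expression metacharacters, and collisions between the random name and existing identifiers are ignored.

```javascript
// One-time analysis: record every offset at which the identifier occurs
// as a whole word.
function buildMap(code, name) {
  const offsets = [];
  const pattern = new RegExp(`\\b${name}\\b`, "g");
  let match;
  while ((match = pattern.exec(code)) !== null) offsets.push(match.index);
  return { name, offsets };
}

// Per-serving step: splice one fresh random name in at each recorded
// offset, without re-analyzing the code.
function serve(code, map, rng = Math.random) {
  const fresh = "f_" + Math.floor(rng() * 0xffffffff).toString(36);
  let out = "";
  let last = 0;
  for (const at of map.offsets) {
    out += code.slice(last, at) + fresh;
    last = at + map.name.length;
  }
  return out + code.slice(last);
}

const page = "function transfer(v) { return v * 2; }\nvar result = transfer(21);";
const nameMap = buildMap(page, "transfer"); // done once
console.log(serve(page, nameMap)); // done per request
console.log(serve(page, nameMap)); // usually a different name
```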
- Even where different modifications are applied in responding to multiple requests for a common web resource, the security server system 502 can apply the modifications in a manner that does not substantially affect a way that the user interacts with the resource, regardless of the different transformations applied. For example, when two different client computers request a common web page, the security server system 502 applies different modifications to the web code corresponding to the web page in response to each request for the web page, but the modifications do not substantially affect a presentation of the web page between the two different client computers. The modifications can therefore be made largely transparent to users interacting with a common web resource so that the modifications do not cause a substantial difference in the way the resource is displayed or the way the user interacts with the resource on different client devices or in different sessions in which the resource is requested.
- An
instrumentation module 526 is programmed to add instrumentation code to the content that is served from a web server. The instrumentation code is code that is programmed to monitor the operation of other code that is served. For example, the instrumentation code may be programmed to identify when certain methods are called, when those methods have been identified as likely to be called by malicious software. When such actions are observed to occur by the instrumentation code, the instrumentation code may be programmed to send a communication to the security server reporting on the type of action that occurred and other meta data that is helpful in characterizing the activity. Such information can be used to help determine whether the action was malicious or benign. - The instrumentation code may also analyze the DOM on a client computer in predetermined manners that are likely to identify the presence of and operation of malicious software, and to report to the security servers 502 or a related system. For example, the instrumentation code may be programmed to characterize a portion of the DOM when a user takes a particular action, such as clicking on a particular on-page button, so as to identify a change in the DOM before and after the click (where the click is expected to cause a particular change to the DOM if there is benign code operating with respect to the click, as opposed to malicious code operating with respect to the click). Data that characterizes the DOM may also be hashed, either at the client computer or the server system 502, to produce a representation of the DOM (e.g., in the differences between part of the DOM before and after a defined action occurs) that is easy to compare against corresponding representations of DOMs from other client computers. Other techniques may also be used by the instrumentation code to generate a compact representation of the DOM or other structure expected to be affected by malicious code in an identifiable manner.
- As noted, the content from web servers 504a-504n, as encoded by decode, analysis, and re-encode module 524, may be rendered on web browsers of various client computers. Uninfected client computers 512a-512n represent computers that do not have malicious code programmed to interfere with a particular site a user visits or to otherwise perform malicious activity. Infected client computers 514a-514n represent computers that do have malware or malicious code (518a-518n, respectively) programmed to interfere with a particular site a user visits or to otherwise perform malicious activity. In certain implementations, the client computers 512a-512n, 514a-514n may also store the encrypted cookies discussed above and pass such cookies back through the network 510. The client computers 512a-512n, 514a-514n will, once they obtain the served content, implement DOMs for managing the displayed web pages, and instrumentation code may monitor the respective DOMs as discussed above. Reports of illogical activity (e.g., software on the client device calling a method that does not exist in the downloaded and rendered content) can then be reported back to the server system.
- The reports from the instrumentation code may be analyzed and processed in various manners in order to determine how to respond to particular abnormal events, and to track down malicious code via analysis of multiple different similar interactions across different client computers 512a-512n, 514a-514n. For small-scale analysis, each web site operator may be provided with a single security console 507 that provides analytical tools for a single site or group of sites. For example, the console 507 may include software for showing groups of abnormal activities, or reports that indicate the type of code served by the web site that generates the most abnormal activity. For example, a security officer for a bank may determine that defensive actions are needed if most of the reported abnormal activity for its web site relates to content elements corresponding to money transfer operations, an indication that stale malicious code may be trying to access such elements surreptitiously.
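As a rough illustration of the kind of console report described above, the following sketch ranks served content elements by how many abnormal-activity reports relate to them and flags any element that accounts for most of the reports. The `AbnormalReport` fields, the element labels, and the 50% threshold are hypothetical and chosen only for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AbnormalReport:
    client_id: str
    element: str  # hypothetical label for the served content element involved
    action: str   # hypothetical label, e.g. "called_missing_method"


def rank_elements(reports):
    """Return (element, count, share) tuples, most-reported element first."""
    counts = Counter(r.element for r in reports)
    total = sum(counts.values())
    return [(el, n, n / total) for el, n in counts.most_common()]


def needs_attention(reports, threshold=0.5):
    """Elements that account for more than `threshold` of all reports."""
    return [el for el, _, share in rank_elements(reports) if share > threshold]
```

With three reports tied to a money-transfer element and one tied to a login element, `needs_attention` would return only the money-transfer element, which corresponds to the bank example in the text.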
- Console 507 may also be multiple different consoles used by different employees of an operator of the system 500, and may be used for pre-analysis of web content before it is served, as part of determining how best to apply polymorphic transformations to the web code. For example, in combined manual and automatic analysis like that described above, an operator at console 507 may form or apply rules 522 that guide the transformation that is to be performed on the content when it is ultimately served. The rules may be written explicitly by the operator or may be provided by automatic analysis and approved by the operator. Alternatively, or in addition, the operator may perform actions in a graphical user interface (e.g., by selecting particular elements from the code by highlighting them with a pointer, and then selecting an operation from a menu of operations) and rules may be written consistent with those actions.
- A central security console 508 may connect to a large number of web content providers, and may be run, for example, by an organization that provides the software for operating the security server systems 502a-502n. Such a console 508 may access complex analytical and data analysis tools, such as tools that identify clustering of abnormal activities across thousands of client computers and sessions, so that an operator of the console 508 can focus on those clusters in order to diagnose them as malicious or benign, and then take steps to thwart any malicious activity.
- In certain other implementations, the console 508 may have access to software for analyzing telemetry data received from a very large number of client computers that execute instrumentation code provided by the system 500. Such data may result from forms being re-written across a large number of web pages and web sites to include content that collects system information such as browser version, installed plug-ins, screen resolution, window size and position, operating system, network information, and the like. In addition, user interaction with served content may be characterized by such code, such as the speed with which a user interacts with a page, the path of a pointer over the page, and the like.
- Such collected telemetry data, across many thousands of sessions and client devices, may be used by the console 508 to identify what is “natural” interaction with a particular page that is likely the result of legitimate human actions, and what is “unnatural” interaction that is likely the result of a bot interacting with the content. Statistical and machine learning methods may be used to identify patterns in such telemetry data, and to resolve bot candidates to particular client computers. Such client computers may then be handled in special manners by the system 500, may be blocked from interaction, or may have their operators notified that their computer is potentially running malicious software (e.g., by sending an e-mail to an account holder of a computer so that the malicious software cannot intercept it easily).
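One simple statistical approach to separating "natural" from "unnatural" interaction can be sketched as follows. The sketch uses a robust z-score based on the median absolute deviation of a single feature (seconds taken to submit a form); this particular statistic, the cutoff of 3.5, the requirement of at least two flagged sessions, and the choice to flag only unusually fast sessions are all assumptions for illustration, not details from the specification.

```python
from collections import defaultdict
from statistics import median


def robust_scores(values):
    """Robust z-scores: 0.6745 * (x - median) / MAD; all zeros if MAD is 0."""
    values = list(values)
    if not values:
        return []
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [0.0] * len(values)
    return [0.6745 * (v - med) / mad for v in values]


def bot_candidates(sessions, cutoff=3.5, min_sessions=2):
    """Resolve unusually fast sessions to client IDs.

    `sessions` is a list of (client_id, seconds_to_submit) pairs, a
    hypothetical telemetry shape. A client is returned only if at least
    `min_sessions` of its sessions score below -cutoff.
    """
    scores = robust_scores([seconds for _, seconds in sessions])
    flagged = defaultdict(int)
    for (client_id, _), score in zip(sessions, scores):
        if score < -cutoff:  # far faster than the typical session
            flagged[client_id] += 1
    return sorted(c for c, n in flagged.items() if n >= min_sessions)
```

A production system would likely combine many features (pointer paths, plug-in lists, window geometry) and a trained model; the single-feature score here only shows the shape of the "flag sessions, then resolve to clients" step.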
- FIG. 6 is a schematic diagram of a general computing system 600. The system 600 can be used for the operations described in association with any of the computer-implemented methods described previously, according to one implementation. The system 600 is intended to include various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The system 600 can also include mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices. Additionally, the system can include portable storage media, such as Universal Serial Bus (USB) flash drives. For example, the USB flash drives may store operating systems and other applications. The USB flash drives can include input/output components, such as a wireless transmitter or USB connector that may be inserted into a USB port of another computing device.
- The system 600 includes a processor 610, a memory 620, a storage device 630, and an input/output device 640. Each of the components 610, 620, 630, and 640 is interconnected using a system bus 650. The processor 610 is capable of processing instructions for execution within the system 600. The processor may be designed using any of a number of architectures. For example, the processor 610 may be a CISC (Complex Instruction Set Computer) processor, a RISC (Reduced Instruction Set Computer) processor, or a MISC (Minimal Instruction Set Computer) processor.
- In one implementation, the processor 610 is a single-threaded processor. In another implementation, the processor 610 is a multi-threaded processor. The processor 610 is capable of processing instructions stored in the memory 620 or on the storage device 630 to display graphical information for a user interface on the input/output device 640.
- The memory 620 stores information within the system 600. In one implementation, the memory 620 is a computer-readable medium. In one implementation, the memory 620 is a volatile memory unit. In another implementation, the memory 620 is a non-volatile memory unit.
- The storage device 630 is capable of providing mass storage for the system 600. In one implementation, the storage device 630 is a computer-readable medium. In various different implementations, the storage device 630 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device.
- The input/output device 640 provides input/output operations for the system 600. In one implementation, the input/output device 640 includes a keyboard and/or pointing device. In another implementation, the input/output device 640 includes a display unit for displaying graphical user interfaces.
- The features described can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. The apparatus can be implemented in a computer program product tangibly embodied in an information carrier, e.g., in a machine-readable storage device for execution by a programmable processor; and method steps can be performed by a programmable processor executing a program of instructions to perform functions of the described implementations by operating on input data and generating output. The described features can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. A computer program is a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result. A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- Suitable processors for the execution of a program of instructions include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors of any kind of computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memories for storing instructions and data. Generally, a computer will also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
- To provide for interaction with a user, the features can be implemented on a computer having a display device such as a CRT (cathode ray tube) or LCD (liquid crystal display) monitor for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer. Additionally, such activities can be implemented via touchscreen flat-panel displays and other appropriate mechanisms.
- The features can be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination of them. The components of the system can be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), peer-to-peer networks (having ad-hoc or static members), grid computing infrastructures, and the Internet.
- The computer system can include clients and servers. A client and server are generally remote from each other and typically interact through a network, such as the described one. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features specific to particular implementations of particular inventions. Certain features that are described in this specification in the context of separate implementations can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
- Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the implementations described above should not be understood as requiring such separation in all implementations, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
- Thus, particular implementations of the subject matter have been described. Other implementations are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous. In some implementations, the subject matter may be embodied as methods, systems, devices, and/or as an article or computer program product. The article or computer program product may comprise one or more computer-readable media or computer-readable storage devices, which may be tangible and non-transitory, that include instructions that may be executable by one or more machines such as computer processors.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/859,694 US20180121680A1 (en) | 2014-05-23 | 2018-01-01 | Obfuscating web code |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/286,324 US9858440B1 (en) | 2014-05-23 | 2014-05-23 | Encoding of sensitive data |
US15/859,694 US20180121680A1 (en) | 2014-05-23 | 2018-01-01 | Obfuscating web code |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/286,324 Continuation US9858440B1 (en) | 2014-05-23 | 2014-05-23 | Encoding of sensitive data |
Publications (1)
Publication Number | Publication Date |
---|---|
US20180121680A1 true US20180121680A1 (en) | 2018-05-03 |
Family
ID=60971724
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/286,324 Active US9858440B1 (en) | 2014-05-23 | 2014-05-23 | Encoding of sensitive data |
US15/859,694 Abandoned US20180121680A1 (en) | 2014-05-23 | 2018-01-01 | Obfuscating web code |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/286,324 Active US9858440B1 (en) | 2014-05-23 | 2014-05-23 | Encoding of sensitive data |
Country Status (1)
Country | Link |
---|---|
US (2) | US9858440B1 (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10216488B1 (en) | 2016-03-14 | 2019-02-26 | Shape Security, Inc. | Intercepting and injecting calls into operations and objects |
US10230718B2 (en) | 2015-07-07 | 2019-03-12 | Shape Security, Inc. | Split serving of computer code |
CN110263533A (en) * | 2019-04-28 | 2019-09-20 | 清华大学 | Safe web page means of defence |
US10834101B2 (en) | 2016-03-09 | 2020-11-10 | Shape Security, Inc. | Applying bytecode obfuscation techniques to programs written in an interpreted language |
US20210334342A1 (en) * | 2020-04-27 | 2021-10-28 | Imperva, Inc. | Procedural code generation for challenge code |
US11349816B2 (en) | 2016-12-02 | 2022-05-31 | F5, Inc. | Obfuscating source code sent, from a server computer, to a browser on a client computer |
EP4209938A1 (en) * | 2022-01-05 | 2023-07-12 | Irdeto B.V. | Systems, methods, and storage media for creating secured computer code |
US11741197B1 (en) | 2019-10-15 | 2023-08-29 | Shape Security, Inc. | Obfuscating programs using different instruction set architectures |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10657262B1 (en) * | 2014-09-28 | 2020-05-19 | Red Balloon Security, Inc. | Method and apparatus for securing embedded device firmware |
US10311229B1 (en) * | 2015-05-18 | 2019-06-04 | Amazon Technologies, Inc. | Mitigating timing side-channel attacks by obscuring alternatives in code |
US10868665B1 (en) * | 2015-05-18 | 2020-12-15 | Amazon Technologies, Inc. | Mitigating timing side-channel attacks by obscuring accesses to sensitive data |
US10380355B2 (en) * | 2017-03-23 | 2019-08-13 | Microsoft Technology Licensing, Llc | Obfuscation of user content in structured user data files |
US10410014B2 (en) | 2017-03-23 | 2019-09-10 | Microsoft Technology Licensing, Llc | Configurable annotations for privacy-sensitive user content |
US11042634B2 (en) * | 2018-12-21 | 2021-06-22 | Fujitsu Limited | Determining information leakage of computer-readable programs |
US11677783B2 (en) * | 2019-10-25 | 2023-06-13 | Target Brands, Inc. | Analysis of potentially malicious emails |
US20210303662A1 (en) * | 2020-03-31 | 2021-09-30 | Irdeto B.V. | Systems, methods, and storage media for creating secured transformed code from input code using a neural network to obscure a transformation function |
US11611629B2 (en) * | 2020-05-13 | 2023-03-21 | Microsoft Technology Licensing, Llc | Inline frame monitoring |
Citations (77)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5003596A (en) * | 1989-08-17 | 1991-03-26 | Cryptech, Inc. | Method of cryptographically transforming electronic digital data from one form to another |
US5315657A (en) * | 1990-09-28 | 1994-05-24 | Digital Equipment Corporation | Compound principals in access control lists |
US5892899A (en) * | 1996-06-13 | 1999-04-06 | Intel Corporation | Tamper resistant methods and apparatus |
US6006328A (en) * | 1995-07-14 | 1999-12-21 | Christopher N. Drake | Computer software authentication, protection, and security system |
US6088452A (en) * | 1996-03-07 | 2000-07-11 | Northern Telecom Limited | Encoding technique for software and hardware |
US6594761B1 (en) * | 1999-06-09 | 2003-07-15 | Cloakware Corporation | Tamper resistant software encoding |
US20030159063A1 (en) * | 2002-02-07 | 2003-08-21 | Larry Apfelbaum | Automated security threat testing of web pages |
US20030163718A1 (en) * | 2000-04-12 | 2003-08-28 | Johnson Harold J. | Tamper resistant software-mass data encoding |
US6668325B1 (en) * | 1997-06-09 | 2003-12-23 | Intertrust Technologies | Obfuscation techniques for enhancing software security |
US20040101142A1 (en) * | 2001-07-05 | 2004-05-27 | Nasypny Vladimir Vladimirovich | Method and system for an integrated protection system of data distributed processing in computer networks and system for carrying out said method |
US20040139340A1 (en) * | 2000-12-08 | 2004-07-15 | Johnson Harold J | System and method for protecting computer software from a white box attack |
US6779114B1 (en) * | 1999-08-19 | 2004-08-17 | Cloakware Corporation | Tamper resistant software-control flow encoding |
US20050002532A1 (en) * | 2002-01-30 | 2005-01-06 | Yongxin Zhou | System and method of hiding cryptographic private keys |
US20050166191A1 (en) * | 2004-01-28 | 2005-07-28 | Cloakware Corporation | System and method for obscuring bit-wise and two's complement integer computations in software |
US20050183072A1 (en) * | 1999-07-29 | 2005-08-18 | Intertrust Technologies Corporation | Software self-defense systems and methods |
US20060031686A1 (en) * | 1999-09-03 | 2006-02-09 | Purdue Research Foundation | Method and system for tamperproofing software |
US20060034455A1 (en) * | 2004-08-12 | 2006-02-16 | Damgaard Ivan B | Permutation data transform to enhance security |
US20060101047A1 (en) * | 2004-07-29 | 2006-05-11 | Rice John R | Method and system for fortifying software |
US20060195703A1 (en) * | 2005-02-25 | 2006-08-31 | Microsoft Corporation | System and method of iterative code obfuscation |
US20060195588A1 (en) * | 2005-01-25 | 2006-08-31 | Whitehat Security, Inc. | System for detecting vulnerabilities in web applications using client-side application interfaces |
US7103180B1 (en) * | 2001-10-25 | 2006-09-05 | Hewlett-Packard Development Company, L.P. | Method of implementing the data encryption standard with reduced computation |
US20060253687A1 (en) * | 2005-05-09 | 2006-11-09 | Microsoft Corporation | Overlapped code obfuscation |
US20070039048A1 (en) * | 2005-08-12 | 2007-02-15 | Microsoft Corporation | Obfuscating computer code to prevent an attack |
US20070064617A1 (en) * | 2005-09-15 | 2007-03-22 | Reves Joseph P | Traffic anomaly analysis for the detection of aberrant network code |
US20080025496A1 (en) * | 2005-08-01 | 2008-01-31 | Asier Technology Corporation, A Delaware Corporation | Encrypting a plaintext message with authentication |
US20080208560A1 (en) * | 2007-02-23 | 2008-08-28 | Harold Joseph Johnson | System and method of interlocking to protect software - mediated program and device behaviors |
US20080222736A1 (en) * | 2007-03-07 | 2008-09-11 | Trusteer Ltd. | Scrambling HTML to prevent CSRF attacks and transactional crimeware attacks |
US20080229394A1 (en) * | 2006-07-10 | 2008-09-18 | Sci Group | Method and System For Securely Protecting Data During Software Application Usage |
US7472413B1 (en) * | 2003-08-11 | 2008-12-30 | F5 Networks, Inc. | Security for WAP servers |
US7506177B2 (en) * | 2001-05-24 | 2009-03-17 | Cloakware Corporation | Tamper resistant software encoding and analysis |
US20090077383A1 (en) * | 2007-08-06 | 2009-03-19 | De Monseignat Bernard | System and method for authentication, data transfer, and protection against phishing |
US20090119515A1 (en) * | 2005-10-28 | 2009-05-07 | Matsushita Electric Industrial Co., Ltd. | Obfuscation evaluation method and obfuscation method |
US20090193513A1 (en) * | 2008-01-26 | 2009-07-30 | Puneet Agarwal | Policy driven fine grain url encoding mechanism for ssl vpn clientless access |
US7580521B1 (en) * | 2003-06-25 | 2009-08-25 | Voltage Security, Inc. | Identity-based-encryption system with hidden public key attributes |
US20090235089A1 (en) * | 2008-03-12 | 2009-09-17 | Mathieu Ciet | Computer object code obfuscation using boot installation |
US20090249492A1 (en) * | 2006-09-21 | 2009-10-01 | Hans Martin Boesgaard Sorensen | Fabrication of computer executable program files from source code |
US20090254572A1 (en) * | 2007-01-05 | 2009-10-08 | Redlich Ron M | Digital information infrastructure and method |
US20090307500A1 (en) * | 2006-02-06 | 2009-12-10 | Taichi Sato | Program obfuscator |
US20100058301A1 (en) * | 2008-08-26 | 2010-03-04 | Apple Inc. | System and method for branch extraction obfuscation |
US20100083072A1 (en) * | 2008-09-30 | 2010-04-01 | Freescale Semiconductor, Inc. | Data interleaver |
US20100107245A1 (en) * | 2008-10-29 | 2010-04-29 | Microsoft Corporation | Tamper-tolerant programs |
US20100186089A1 (en) * | 2009-01-22 | 2010-07-22 | International Business Machines Corporation | Method and system for protecting cross-domain interaction of a web application on an unmodified browser |
US20100257354A1 (en) * | 2007-09-07 | 2010-10-07 | Dis-Ent, Llc | Software based multi-channel polymorphic data obfuscation |
US20100281459A1 (en) * | 2009-05-01 | 2010-11-04 | Apple Inc. | Systems, methods, and computer-readable media for fertilizing machine-executable code |
US20110129089A1 (en) * | 2009-11-30 | 2011-06-02 | Electronics And Telecommunications Research Institute | Method and apparatus for partially encoding/decoding data for commitment service and method of using encoded data |
US20110131416A1 (en) * | 2009-11-30 | 2011-06-02 | James Paul Schneider | Multifactor validation of requests to thwart dynamic cross-site attacks |
US20110167407A1 (en) * | 2010-01-06 | 2011-07-07 | Apple Inc. | System and method for software data reference obfuscation |
US20110302424A1 (en) * | 2001-06-13 | 2011-12-08 | Intertrust Technologies Corp. | Software Self-Checking Systems and Methods |
US20120022942A1 (en) * | 2010-04-01 | 2012-01-26 | Lee Hahn Holloway | Internet-based proxy service to modify internet responses |
US8185749B2 (en) * | 2008-09-02 | 2012-05-22 | Apple Inc. | System and method for revising boolean and arithmetic operations |
US8266243B1 (en) * | 2010-03-30 | 2012-09-11 | Amazon Technologies, Inc. | Feedback mechanisms providing contextual information |
US8347398B1 (en) * | 2009-09-23 | 2013-01-01 | Savvystuff Property Trust | Selected text obfuscation and encryption in a local, network and cloud computing environment |
US20130046995A1 (en) * | 2010-02-23 | 2013-02-21 | David Movshovitz | Method and computer program product for order preserving symbol based encryption |
US8393003B2 (en) * | 2006-12-21 | 2013-03-05 | Telefonaktiebolaget L M Ericsson (Publ) | Obfuscating computer program code |
US8392910B1 (en) * | 2007-04-10 | 2013-03-05 | AT & T Intellectual Property II, LLP | Stochastic method for program security using deferred linking |
US20130061323A1 (en) * | 2008-04-23 | 2013-03-07 | Trusted Knight Corporation | System and method for protecting against malware utilizing key loggers |
US20130067225A1 (en) * | 2008-09-08 | 2013-03-14 | Ofer Shochet | Appliance, system, method and corresponding software components for encrypting and processing data |
US20130179985A1 (en) * | 2012-01-05 | 2013-07-11 | Vmware, Inc. | Securing user data in cloud computing environments |
US20130232578A1 (en) * | 2012-03-02 | 2013-09-05 | Apple Inc. | Method and apparatus for obfuscating program source codes |
US8615804B2 (en) * | 2010-02-18 | 2013-12-24 | Polytechnic Institute Of New York University | Complementary character encoding for preventing input injection in web applications |
US20140013427A1 (en) * | 2011-03-24 | 2014-01-09 | Irdeto B.V. | System And Method Providing Dependency Networks Throughout Applications For Attack Resistance |
US20140165197A1 (en) * | 2012-12-06 | 2014-06-12 | Empire Technology Development, Llc | Malware attack prevention using block code permutation |
US8762705B2 (en) * | 2008-07-24 | 2014-06-24 | Alibaba Group Holding Limited | System and method for preventing web crawler access |
US20140282872A1 (en) * | 2013-03-15 | 2014-09-18 | Shape Security Inc. | Stateless web content anti-automation |
US20140283069A1 (en) * | 2013-03-15 | 2014-09-18 | Shape Security Inc. | Protecting against the introduction of alien content |
US20140281535A1 (en) * | 2013-03-15 | 2014-09-18 | Munibonsoftware.com, LLC | Apparatus and Method for Preventing Information from Being Extracted from a Webpage |
US20150039962A1 (en) * | 2010-09-10 | 2015-02-05 | John P. Fonseka | Methods, apparatus, and systems for coding with constrained interleaving |
US20150180509A9 (en) * | 2010-09-10 | 2015-06-25 | John P. Fonseka | Methods, apparatus, and systems for coding with constrained interleaving |
US20150350243A1 (en) * | 2013-03-15 | 2015-12-03 | Shape Security Inc. | Safe Intelligent Content Modification |
US9241004B1 (en) * | 2014-03-11 | 2016-01-19 | Trend Micro Incorporated | Alteration of web documents for protection against web-injection attacks |
US9270647B2 (en) * | 2013-12-06 | 2016-02-23 | Shape Security, Inc. | Client/server security by an intermediary rendering modified in-memory objects |
US20170041341A1 (en) * | 2014-05-23 | 2017-02-09 | Shape Security, Inc. | Polymorphic Treatment of Data Entered At Clients |
US9582666B1 (en) * | 2015-05-07 | 2017-02-28 | Shape Security, Inc. | Computer system for improved security of server computers interacting with client computers |
US9602543B2 (en) * | 2014-09-09 | 2017-03-21 | Shape Security, Inc. | Client/server polymorphism using polymorphic hooks |
US9712561B2 (en) * | 2014-01-20 | 2017-07-18 | Shape Security, Inc. | Intercepting and supervising, in a runtime environment, calls to one or more objects in a web page |
US10122747B2 (en) * | 2013-12-06 | 2018-11-06 | Lookout, Inc. | Response generation after distributed monitoring and evaluation of multiple devices |
US10216488B1 (en) * | 2016-03-14 | 2019-02-26 | Shape Security, Inc. | Intercepting and injecting calls into operations and objects |
Family Cites Families (70)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2212574C (en) | 1995-02-13 | 2010-02-02 | Electronic Publishing Resources, Inc. | Systems and methods for secure transaction management and electronic rights protection |
US6865735B1 (en) | 1997-10-07 | 2005-03-08 | University Of Washington | Process for rewriting executable content on a network server or desktop machine in order to enforce site specific properties |
SE512672C2 (en) | 1998-06-12 | 2000-04-17 | Ericsson Telefon Ab L M | Procedure and system for transferring a cookie |
US6697948B1 (en) | 1999-05-05 | 2004-02-24 | Michael O. Rabin | Methods and apparatus for protecting information |
CA2447451C (en) | 2000-05-12 | 2013-02-12 | Xtreamlok Pty. Ltd. | Information security method and system |
US6938170B1 (en) | 2000-07-17 | 2005-08-30 | International Business Machines Corporation | System and method for preventing automated crawler access to web-based data sources using a dynamic data transcoding scheme |
US7117239B1 (en) | 2000-07-28 | 2006-10-03 | Axeda Corporation | Reporting the state of an apparatus to a remote computer |
WO2002088951A1 (en) | 2001-04-26 | 2002-11-07 | Telefonaktiebolaget Lm Ericsson (Publ) | Stateless server |
WO2002093393A1 (en) | 2001-05-11 | 2002-11-21 | Sap Portals, Inc. | Browser with messaging capability and other persistent connections |
US7028305B2 (en) | 2001-05-16 | 2006-04-11 | Softricity, Inc. | Operating system abstraction and protection layer |
US7010779B2 (en) | 2001-08-16 | 2006-03-07 | Knowledge Dynamics, Inc. | Parser, code generator, and data calculation and transformation engine for spreadsheet calculations |
US20040162994A1 (en) | 2002-05-13 | 2004-08-19 | Sandia National Laboratories | Method and apparatus for configurable communication network defenses |
US7117429B2 (en) | 2002-06-12 | 2006-10-03 | Oracle International Corporation | Methods and systems for managing styles electronic documents |
US7333072B2 (en) | 2003-03-24 | 2008-02-19 | Semiconductor Energy Laboratory Co., Ltd. | Thin film integrated circuit device |
US8510571B1 (en) | 2003-03-24 | 2013-08-13 | Hoi Chang | System and method for inserting security mechanisms into a software program |
US7500099B1 (en) | 2003-05-16 | 2009-03-03 | Microsoft Corporation | Method for mitigating web-based “one-click” attacks |
US7735144B2 (en) | 2003-05-16 | 2010-06-08 | Adobe Systems Incorporated | Document modification detection and prevention |
WO2004109532A1 (en) | 2003-06-05 | 2004-12-16 | Cubicice (Pty) Ltd | A method of collecting data regarding a plurality of web pages visited by at least one user |
US8806187B1 (en) | 2009-12-03 | 2014-08-12 | Google Inc. | Protecting browser-viewed content from piracy |
US7624449B1 (en) | 2004-01-22 | 2009-11-24 | Symantec Corporation | Countering polymorphic malicious computer code through code optimization |
US7475341B2 (en) | 2004-06-15 | 2009-01-06 | At&T Intellectual Property I, L.P. | Converting the format of a portion of an electronic document |
US7480385B2 (en) | 2004-11-05 | 2009-01-20 | Cable Television Laboratories, Inc. | Hierarchical encryption key system for securing digital media |
US7707223B2 (en) | 2005-04-28 | 2010-04-27 | Cisco Technology, Inc. | Client-side java content transformation |
US7770185B2 (en) | 2005-09-26 | 2010-08-03 | Bea Systems, Inc. | Interceptor method and system for web services for remote portlets |
US8170020B2 (en) | 2005-12-08 | 2012-05-01 | Microsoft Corporation | Leveraging active firewalls for network intrusion detection and retardation of attack |
GB0620855D0 (en) | 2006-10-19 | 2006-11-29 | Dovetail Software Corp Ltd | Data processing apparatus and method |
JP5133973B2 (en) | 2007-01-18 | 2013-01-30 | パナソニック株式会社 | Obfuscation support device, obfuscation support method, program, and integrated circuit |
US8290800B2 (en) | 2007-01-30 | 2012-10-16 | Google Inc. | Probabilistic inference of site demographics from aggregate user internet usage and source demographic information |
WO2008095018A2 (en) | 2007-01-31 | 2008-08-07 | Omniture, Inc. | Page grouping for site traffic analysis reports |
WO2008130946A2 (en) | 2007-04-17 | 2008-10-30 | Kenneth Tola | Unobtrusive methods and systems for collecting information transmitted over a network |
US8527757B2 (en) | 2007-06-22 | 2013-09-03 | Gemalto Sa | Method of preventing web browser extensions from hijacking user information |
US7941382B2 (en) | 2007-10-12 | 2011-05-10 | Microsoft Corporation | Method of classifying and active learning that ranks entries based on multiple scores, presents entries to human analysts, and detects and/or prevents malicious behavior |
US8260845B1 (en) | 2007-11-21 | 2012-09-04 | Appcelerator, Inc. | System and method for auto-generating JavaScript proxies and meta-proxies |
US8347396B2 (en) | 2007-11-30 | 2013-01-01 | International Business Machines Corporation | Protect sensitive content for human-only consumption |
US9317255B2 (en) | 2008-03-28 | 2016-04-19 | Microsoft Technology Licensing, LLC | Automatic code transformation with state transformer monads |
CA2630388A1 (en) | 2008-05-05 | 2009-11-05 | Nima Sharifmehr | Apparatus and method to prevent man in the middle attack |
KR100987354B1 (en) | 2008-05-22 | 2010-10-12 | 주식회사 이베이지마켓 | System for checking false code in website and Method thereof |
US9405555B2 (en) | 2008-05-23 | 2016-08-02 | Microsoft Technology Licensing, Llc | Automated code splitting and pre-fetching for improving responsiveness of browser-based applications |
KR101027928B1 (en) | 2008-07-23 | 2011-04-12 | 한국전자통신연구원 | Apparatus and Method for detecting obfuscated web page |
CN102217225B (en) | 2008-10-03 | 2014-04-02 | 杰出网络公司 | Content delivery network encryption |
US8020193B2 (en) | 2008-10-20 | 2011-09-13 | International Business Machines Corporation | Systems and methods for protecting web based applications from cross site request forgery attacks |
US8434068B2 (en) | 2008-10-23 | 2013-04-30 | XMOS Ltd. | Development system |
US8225401B2 (en) | 2008-12-18 | 2012-07-17 | Symantec Corporation | Methods and systems for detecting man-in-the-browser attacks |
CN101482882A (en) | 2009-02-17 | 2009-07-15 | 阿里巴巴集团控股有限公司 | Method and system for cross-domain treatment of COOKIE |
US9311425B2 (en) | 2009-03-31 | 2016-04-12 | Qualcomm Incorporated | Rendering a page using a previously stored DOM associated with a different page |
US8332952B2 (en) | 2009-05-22 | 2012-12-11 | Microsoft Corporation | Time window based canary solutions for browser security |
US8527774B2 (en) | 2009-05-28 | 2013-09-03 | Kaazing Corporation | System and methods for providing stateless security management for web applications using non-HTTP communications protocols |
US8924943B2 (en) | 2009-07-17 | 2014-12-30 | Ebay Inc. | Browser emulator system |
US11102325B2 (en) | 2009-10-23 | 2021-08-24 | Moov Corporation | Configurable and dynamic transformation of web content |
US8539224B2 (en) | 2009-11-05 | 2013-09-17 | International Business Machines Corporation | Obscuring form data through obfuscation |
US8353037B2 (en) | 2009-12-03 | 2013-01-08 | International Business Machines Corporation | Mitigating malicious file propagation with progressive identifiers |
US8660976B2 (en) | 2010-01-20 | 2014-02-25 | Microsoft Corporation | Web content rewriting, including responses |
US20110255689A1 (en) | 2010-04-15 | 2011-10-20 | Lsi Corporation | Multiple-mode cryptographic module usable with memory controllers |
US8739150B2 (en) | 2010-05-28 | 2014-05-27 | Smartshift Gmbh | Systems and methods for dynamically replacing code objects via conditional pattern templates |
US8914879B2 (en) | 2010-06-11 | 2014-12-16 | Trustwave Holdings, Inc. | System and method for improving coverage for web code |
US20120124372A1 (en) | 2010-10-13 | 2012-05-17 | Akamai Technologies, Inc. | Protecting Websites and Website Users By Obscuring URLs |
US8631091B2 (en) | 2010-10-15 | 2014-01-14 | Northeastern University | Content distribution network using a web browser and locally stored content to directly exchange content between users |
US8751822B2 (en) | 2010-12-20 | 2014-06-10 | Motorola Mobility Llc | Cryptography using quasigroups |
AU2011200413B1 (en) | 2011-02-01 | 2011-09-15 | Symbiotic Technologies Pty Ltd | Methods and Systems to Detect Attacks on Internet Transactions |
US8590041B2 (en) | 2011-11-28 | 2013-11-19 | Mcafee, Inc. | Application sandboxing using a dynamic optimization framework |
US8904279B1 (en) | 2011-12-07 | 2014-12-02 | Amazon Technologies, Inc. | Inhibiting automated extraction of data from network pages |
WO2013091709A1 (en) | 2011-12-22 | 2013-06-27 | Fundació Privada Barcelona Digital Centre Tecnologic | Method and apparatus for real-time dynamic transformation of the code of a web document |
US10049168B2 (en) | 2012-01-31 | 2018-08-14 | Openwave Mobility, Inc. | Systems and methods for modifying webpage data |
US9111090B2 (en) | 2012-04-02 | 2015-08-18 | Trusteer, Ltd. | Detection of phishing attempts |
US20140089786A1 (en) | 2012-06-01 | 2014-03-27 | Atiq Hashmi | Automated Processor For Web Content To Mobile-Optimized Content Transformation |
US8595613B1 (en) | 2012-07-26 | 2013-11-26 | Viasat Inc. | Page element identifier pre-classification for user interface behavior in a communications system |
US8806627B1 (en) | 2012-12-17 | 2014-08-12 | Emc Corporation | Content randomization for thwarting malicious software attacks |
US9294502B1 (en) | 2013-12-06 | 2016-03-22 | Radware, Ltd. | Method and system for detection of malicious bots |
GB201415860D0 (en) | 2014-09-08 | 2014-10-22 | User Replay Ltd | Systems and methods for recording and recreating interactive user-sessions involving an on-line server |
WO2017156158A1 (en) | 2016-03-09 | 2017-09-14 | Shape Security, Inc. | Applying bytecode obfuscation techniques to programs written in an interpreted language |
- 2014-05-23: US application US14/286,324, published as US9858440B1, status: Active
- 2018-01-01: US application US15/859,694, published as US20180121680A1, status: Abandoned
Patent Citations (108)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5003596A (en) * | 1989-08-17 | 1991-03-26 | Cryptech, Inc. | Method of cryptographically transforming electronic digital data from one form to another |
US5315657A (en) * | 1990-09-28 | 1994-05-24 | Digital Equipment Corporation | Compound principals in access control lists |
US6006328A (en) * | 1995-07-14 | 1999-12-21 | Christopher N. Drake | Computer software authentication, protection, and security system |
US6088452A (en) * | 1996-03-07 | 2000-07-11 | Northern Telecom Limited | Encoding technique for software and hardware |
US5892899A (en) * | 1996-06-13 | 1999-04-06 | Intel Corporation | Tamper resistant methods and apparatus |
US6668325B1 (en) * | 1997-06-09 | 2003-12-23 | Intertrust Technologies | Obfuscation techniques for enhancing software security |
US6594761B1 (en) * | 1999-06-09 | 2003-07-15 | Cloakware Corporation | Tamper resistant software encoding |
US6842862B2 (en) * | 1999-06-09 | 2005-01-11 | Cloakware Corporation | Tamper resistant software encoding |
US7779394B2 (en) * | 1999-07-29 | 2010-08-17 | Intertrust Technologies Corporation | Software self-defense systems and methods |
US20150278491A1 (en) * | 1999-07-29 | 2015-10-01 | Intertrust Technologies Corporation | Software self-defense systems and methods |
US20070234070A1 (en) * | 1999-07-29 | 2007-10-04 | Intertrust Technologies Corp. | Software self-defense systems and methods |
US9064099B2 (en) * | 1999-07-29 | 2015-06-23 | Intertrust Technologies Corporation | Software self-defense systems and methods |
US7779270B2 (en) * | 1999-07-29 | 2010-08-17 | Intertrust Technologies Corporation | Software self-defense systems and methods |
US20130232343A1 (en) * | 1999-07-29 | 2013-09-05 | Intertrust Technologies Corporation | Software self-defense systems and methods |
US7430670B1 (en) * | 1999-07-29 | 2008-09-30 | Intertrust Technologies Corp. | Software self-defense systems and methods |
US20050183072A1 (en) * | 1999-07-29 | 2005-08-18 | Intertrust Technologies Corporation | Software self-defense systems and methods |
US20050204348A1 (en) * | 1999-07-29 | 2005-09-15 | Inter Trust Technologies Corporation | Software self-defense systems and methods |
US20050210275A1 (en) * | 1999-07-29 | 2005-09-22 | Intertrust Technologies Corporation | Software self-defense systems and methods |
US7823135B2 (en) * | 1999-07-29 | 2010-10-26 | Intertrust Technologies Corporation | Software self-defense systems and methods |
US20110035733A1 (en) * | 1999-07-29 | 2011-02-10 | Intertrust Technologies Corp. | Software Self-Defense Systems and Methods |
US8387022B2 (en) * | 1999-07-29 | 2013-02-26 | Intertrust Technologies Corp. | Software self-defense systems and methods |
US6779114B1 (en) * | 1999-08-19 | 2004-08-17 | Cloakware Corporation | Tamper resistant software-control flow encoding |
US20060031686A1 (en) * | 1999-09-03 | 2006-02-09 | Purdue Research Foundation | Method and system for tamperproofing software |
US20030163718A1 (en) * | 2000-04-12 | 2003-08-28 | Johnson Harold J. | Tamper resistant software-mass data encoding |
US20040139340A1 (en) * | 2000-12-08 | 2004-07-15 | Johnson Harold J | System and method for protecting computer software from a white box attack |
US7506177B2 (en) * | 2001-05-24 | 2009-03-17 | Cloakware Corporation | Tamper resistant software encoding and analysis |
US20110302424A1 (en) * | 2001-06-13 | 2011-12-08 | Intertrust Technologies Corp. | Software Self-Checking Systems and Methods |
US20040101142A1 (en) * | 2001-07-05 | 2004-05-27 | Nasypny Vladimir Vladimirovich | Method and system for an integrated protection system of data distributed processing in computer networks and system for carrying out said method |
US7103180B1 (en) * | 2001-10-25 | 2006-09-05 | Hewlett-Packard Development Company, L.P. | Method of implementing the data encryption standard with reduced computation |
US20050002532A1 (en) * | 2002-01-30 | 2005-01-06 | Yongxin Zhou | System and method of hiding cryptographic private keys |
US20030159063A1 (en) * | 2002-02-07 | 2003-08-21 | Larry Apfelbaum | Automated security threat testing of web pages |
US7580521B1 (en) * | 2003-06-25 | 2009-08-25 | Voltage Security, Inc. | Identity-based-encryption system with hidden public key attributes |
US7961879B1 (en) * | 2003-06-25 | 2011-06-14 | Voltage Security, Inc. | Identity-based-encryption system with hidden public key attributes |
US7472413B1 (en) * | 2003-08-11 | 2008-12-30 | F5 Networks, Inc. | Security for WAP servers |
US20050166191A1 (en) * | 2004-01-28 | 2005-07-28 | Cloakware Corporation | System and method for obscuring bit-wise and two's complement integer computations in software |
US20060101047A1 (en) * | 2004-07-29 | 2006-05-11 | Rice John R | Method and system for fortifying software |
US20060034455A1 (en) * | 2004-08-12 | 2006-02-16 | Damgaard Ivan B | Permutation data transform to enhance security |
US8077861B2 (en) * | 2004-08-12 | 2011-12-13 | Cmla, Llc | Permutation data transform to enhance security |
US20060195588A1 (en) * | 2005-01-25 | 2006-08-31 | Whitehat Security, Inc. | System for detecting vulnerabilities in web applications using client-side application interfaces |
US7587616B2 (en) * | 2005-02-25 | 2009-09-08 | Microsoft Corporation | System and method of iterative code obfuscation |
US20060195703A1 (en) * | 2005-02-25 | 2006-08-31 | Microsoft Corporation | System and method of iterative code obfuscation |
US20060253687A1 (en) * | 2005-05-09 | 2006-11-09 | Microsoft Corporation | Overlapped code obfuscation |
US20080025496A1 (en) * | 2005-08-01 | 2008-01-31 | Asier Technology Corporation, A Delaware Corporation | Encrypting a plaintext message with authentication |
US20100172494A1 (en) * | 2005-08-01 | 2010-07-08 | Kevin Martin Henson | Encrypting a plaintext message with authentication |
US7620987B2 (en) * | 2005-08-12 | 2009-11-17 | Microsoft Corporation | Obfuscating computer code to prevent an attack |
US20070039048A1 (en) * | 2005-08-12 | 2007-02-15 | Microsoft Corporation | Obfuscating computer code to prevent an attack |
US20070064617A1 (en) * | 2005-09-15 | 2007-03-22 | Reves Joseph P | Traffic anomaly analysis for the detection of aberrant network code |
US20090119515A1 (en) * | 2005-10-28 | 2009-05-07 | Matsushita Electric Industrial Co., Ltd. | Obfuscation evaluation method and obfuscation method |
US20090307500A1 (en) * | 2006-02-06 | 2009-12-10 | Taichi Sato | Program obfuscator |
US20080229394A1 (en) * | 2006-07-10 | 2008-09-18 | Sci Group | Method and System For Securely Protecting Data During Software Application Usage |
US20090249492A1 (en) * | 2006-09-21 | 2009-10-01 | Hans Martin Boesgaard Sorensen | Fabrication of computer executable program files from source code |
US8393003B2 (en) * | 2006-12-21 | 2013-03-05 | Telefonaktiebolaget L M Ericsson (Publ) | Obfuscating computer program code |
US20090254572A1 (en) * | 2007-01-05 | 2009-10-08 | Redlich Ron M | Digital information infrastructure and method |
US8752032B2 (en) * | 2007-02-23 | 2014-06-10 | Irdeto Canada Corporation | System and method of interlocking to protect software-mediated program and device behaviours |
US20150213239A1 (en) * | 2007-02-23 | 2015-07-30 | Irdeto Canada Corporation | System and method of interlocking to protect software-mediated program and device behaviours |
US20080208560A1 (en) * | 2007-02-23 | 2008-08-28 | Harold Joseph Johnson | System and method of interlocking to protect software - mediated program and device behaviors |
US20150074803A1 (en) * | 2007-02-23 | 2015-03-12 | Irdeto Canada Corporation | System and method of interlocking to protect software-mediated program and device behaviours |
US8161463B2 (en) * | 2007-02-23 | 2012-04-17 | Irdeto Canada Corporation | System and method of interlocking to protect software—mediated program and device behaviors |
US20080216051A1 (en) * | 2007-02-23 | 2008-09-04 | Harold Joseph Johnson | System and method of interlocking to protect software-mediated program and device behaviours |
US20080222736A1 (en) * | 2007-03-07 | 2008-09-11 | Trusteer Ltd. | Scrambling HTML to prevent CSRF attacks and transactional crimeware attacks |
US8392910B1 (en) * | 2007-04-10 | 2013-03-05 | AT & T Intellectual Property II, LLP | Stochastic method for program security using deferred linking |
US20130152071A1 (en) * | 2007-04-10 | 2013-06-13 | At & T Intellectual Property Ii, L.P. | Stochastic Method for Program Security Using Deferred Linking |
US20090077383A1 (en) * | 2007-08-06 | 2009-03-19 | De Monseignat Bernard | System and method for authentication, data transfer, and protection against phishing |
US20100257354A1 (en) * | 2007-09-07 | 2010-10-07 | Dis-Ent, Llc | Software based multi-channel polymorphic data obfuscation |
US20090193513A1 (en) * | 2008-01-26 | 2009-07-30 | Puneet Agarwal | Policy driven fine grain url encoding mechanism for ssl vpn clientless access |
US20090235089A1 (en) * | 2008-03-12 | 2009-09-17 | Mathieu Ciet | Computer object code obfuscation using boot installation |
US20130061323A1 (en) * | 2008-04-23 | 2013-03-07 | Trusted Knight Corporation | System and method for protecting against malware utilizing key loggers |
US8762705B2 (en) * | 2008-07-24 | 2014-06-24 | Alibaba Group Holding Limited | System and method for preventing web crawler access |
US20150195305A1 (en) * | 2008-07-24 | 2015-07-09 | Alibaba Group Holding Limited | System and method for preventing web crawler access |
US20100058301A1 (en) * | 2008-08-26 | 2010-03-04 | Apple Inc. | System and method for branch extraction obfuscation |
US8185749B2 (en) * | 2008-09-02 | 2012-05-22 | Apple Inc. | System and method for revising boolean and arithmetic operations |
US20130067225A1 (en) * | 2008-09-08 | 2013-03-14 | Ofer Shochet | Appliance, system, method and corresponding software components for encrypting and processing data |
US20100083072A1 (en) * | 2008-09-30 | 2010-04-01 | Freescale Semiconductor, Inc. | Data interleaver |
US20100107245A1 (en) * | 2008-10-29 | 2010-04-29 | Microsoft Corporation | Tamper-tolerant programs |
US20100186089A1 (en) * | 2009-01-22 | 2010-07-22 | International Business Machines Corporation | Method and system for protecting cross-domain interaction of a web application on an unmodified browser |
US20100281459A1 (en) * | 2009-05-01 | 2010-11-04 | Apple Inc. | Systems, methods, and computer-readable media for fertilizing machine-executable code |
US8347398B1 (en) * | 2009-09-23 | 2013-01-01 | Savvystuff Property Trust | Selected text obfuscation and encryption in a local, network and cloud computing environment |
US20110131416A1 (en) * | 2009-11-30 | 2011-06-02 | James Paul Schneider | Multifactor validation of requests to thwart dynamic cross-site attacks |
US20110129089A1 (en) * | 2009-11-30 | 2011-06-02 | Electronics And Telecommunications Research Institute | Method and apparatus for partially encoding/decoding data for commitment service and method of using encoded data |
US20110167407A1 (en) * | 2010-01-06 | 2011-07-07 | Apple Inc. | System and method for software data reference obfuscation |
US8615804B2 (en) * | 2010-02-18 | 2013-12-24 | Polytechnic Institute Of New York University | Complementary character encoding for preventing input injection in web applications |
US20130046995A1 (en) * | 2010-02-23 | 2013-02-21 | David Movshovitz | Method and computer program product for order preserving symbol based encryption |
US8266243B1 (en) * | 2010-03-30 | 2012-09-11 | Amazon Technologies, Inc. | Feedback mechanisms providing contextual information |
US20120022942A1 (en) * | 2010-04-01 | 2012-01-26 | Lee Hahn Holloway | Internet-based proxy service to modify internet responses |
US20150180509A9 (en) * | 2010-09-10 | 2015-06-25 | John P. Fonseka | Methods, apparatus, and systems for coding with constrained interleaving |
US20150039962A1 (en) * | 2010-09-10 | 2015-02-05 | John P. Fonseka | Methods, apparatus, and systems for coding with constrained interleaving |
US20140013427A1 (en) * | 2011-03-24 | 2014-01-09 | Irdeto B.V. | System And Method Providing Dependency Networks Throughout Applications For Attack Resistance |
US20130179985A1 (en) * | 2012-01-05 | 2013-07-11 | Vmware, Inc. | Securing user data in cloud computing environments |
US20130232578A1 (en) * | 2012-03-02 | 2013-09-05 | Apple Inc. | Method and apparatus for obfuscating program source codes |
US8661549B2 (en) * | 2012-03-02 | 2014-02-25 | Apple Inc. | Method and apparatus for obfuscating program source codes |
US20140165197A1 (en) * | 2012-12-06 | 2014-06-12 | Empire Technology Development, Llc | Malware attack prevention using block code permutation |
US20140281535A1 (en) * | 2013-03-15 | 2014-09-18 | Munibonsoftware.com, LLC | Apparatus and Method for Preventing Information from Being Extracted from a Webpage |
US20180041527A1 (en) * | 2013-03-15 | 2018-02-08 | Shape Security, Inc. | Using instrumentation code to detect bots or malware |
US20140282872A1 (en) * | 2013-03-15 | 2014-09-18 | Shape Security Inc. | Stateless web content anti-automation |
US9178908B2 (en) * | 2013-03-15 | 2015-11-03 | Shape Security, Inc. | Protecting against the introduction of alien content |
US20150350243A1 (en) * | 2013-03-15 | 2015-12-03 | Shape Security Inc. | Safe Intelligent Content Modification |
US20140283069A1 (en) * | 2013-03-15 | 2014-09-18 | Shape Security Inc. | Protecting against the introduction of alien content |
US20160197945A1 (en) * | 2013-03-15 | 2016-07-07 | Shape Security, Inc. | Protecting against the introduction of alien content |
US20190243971A1 (en) * | 2013-03-15 | 2019-08-08 | Shape Security, Inc. | Using instrumentation code to detect bots or malware |
US9270647B2 (en) * | 2013-12-06 | 2016-02-23 | Shape Security, Inc. | Client/server security by an intermediary rendering modified in-memory objects |
US10122747B2 (en) * | 2013-12-06 | 2018-11-06 | Lookout, Inc. | Response generation after distributed monitoring and evaluation of multiple devices |
US10027628B2 (en) * | 2013-12-06 | 2018-07-17 | Shape Security, Inc. | Client/server security by an intermediary rendering modified in-memory objects |
US9712561B2 (en) * | 2014-01-20 | 2017-07-18 | Shape Security, Inc. | Intercepting and supervising, in a runtime environment, calls to one or more objects in a web page |
US9241004B1 (en) * | 2014-03-11 | 2016-01-19 | Trend Micro Incorporated | Alteration of web documents for protection against web-injection attacks |
US20170041341A1 (en) * | 2014-05-23 | 2017-02-09 | Shape Security, Inc. | Polymorphic Treatment of Data Entered At Clients |
US9602543B2 (en) * | 2014-09-09 | 2017-03-21 | Shape Security, Inc. | Client/server polymorphism using polymorphic hooks |
US9582666B1 (en) * | 2015-05-07 | 2017-02-28 | Shape Security, Inc. | Computer system for improved security of server computers interacting with client computers |
US10216488B1 (en) * | 2016-03-14 | 2019-02-26 | Shape Security, Inc. | Intercepting and injecting calls into operations and objects |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10230718B2 (en) | 2015-07-07 | 2019-03-12 | Shape Security, Inc. | Split serving of computer code |
US10834101B2 (en) | 2016-03-09 | 2020-11-10 | Shape Security, Inc. | Applying bytecode obfuscation techniques to programs written in an interpreted language |
US10216488B1 (en) | 2016-03-14 | 2019-02-26 | Shape Security, Inc. | Intercepting and injecting calls into operations and objects |
US11349816B2 (en) | 2016-12-02 | 2022-05-31 | F5, Inc. | Obfuscating source code sent, from a server computer, to a browser on a client computer |
CN110263533A (en) * | 2019-04-28 | 2019-09-20 | 清华大学 | Safe web page means of defence |
US11741197B1 (en) | 2019-10-15 | 2023-08-29 | Shape Security, Inc. | Obfuscating programs using different instruction set architectures |
US20210334342A1 (en) * | 2020-04-27 | 2021-10-28 | Imperva, Inc. | Procedural code generation for challenge code |
US11748460B2 (en) * | 2020-04-27 | 2023-09-05 | Imperva, Inc. | Procedural code generation for challenge code |
EP4209938A1 (en) * | 2022-01-05 | 2023-07-12 | Irdeto B.V. | Systems, methods, and storage media for creating secured computer code |
Also Published As
Publication number | Publication date |
---|---|
US9858440B1 (en) | 2018-01-02 |
Similar Documents
Publication | Title |
---|---|
US20180121680A1 (en) | Obfuscating web code |
US11297097B2 (en) | Code modification for detecting abnormal activity |
US9973519B2 (en) | Protecting a server computer by detecting the identity of a browser on a client computer |
US10382482B2 (en) | Polymorphic obfuscation of executable code |
US10193909B2 (en) | Using instrumentation code to detect bots or malware |
US10205742B2 (en) | Stateless web content anti-automation |
US20190141064A1 (en) | Detecting attacks against a server computer based on characterizing user interactions with the client computing device |
US9489526B1 (en) | Pre-analyzing served content |
US9584534B1 (en) | Dynamic field re-rendering |
US9325734B1 (en) | Distributed polymorphic transformation of served content |
US9112900B1 (en) | Distributed polymorphic transformation of served content |
US12058170B2 (en) | Code modification for detecting abnormal activity |
Legal Events
Code | Title | Description |
---|---|---|
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
AS | Assignment | Owner name: SHAPE SECURITY, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: WANG, XINRAN; ZHAO, YAO; REEL/FRAME: 050910/0270; Effective date: 20140522 |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |