

Detecting Telephone-based Social Engineering Attacks using Scam Signatures

Proceedings of the 2021 ACM Workshop on Security and Privacy Analytics

Session 2: Security Analytics, IWSPA'21, April 28, 2021, Virtual Event, USA

Ali Derakhshan, aderakh1@uci.edu, University of California, Irvine
Ian G. Harris, harris@ics.uci.edu, University of California, Irvine
Mitra Behzadi, mbehzadi@uci.edu, University of California, Irvine

With the advent of computer-mediated communication, it has become possible to protect users from these attacks. Email phishing has been shown to be an effective attack over the years, consistently deceiving a broad range of people [23]. Attackers increasingly seek financial and sometimes political benefit by stealing personal information from individuals and organizations [24]. The Verizon 2019 Data Breach Investigations Report [37] states that 32% of all breaches involved phishing, as did 78% of all cyber-espionage incidents involving state-affiliated actors. Telephone-based social engineering attacks have several properties which distinguish them from phishing email attacks and make their detection difficult. Telephone calls do not have header information or other meta-data which is often used to identify suspicious email attributes such as a false message origin. Telephone calls do have caller ID, but this is easily spoofed [19]. Telephone scams may be pre-recorded, but they often involve real-time communication with a live attacker. When a live attacker is involved, the content of a single type of attack can vary as the attacker adjusts the dialogue to match the vulnerabilities of the victim [14, 20]. Although the detailed wording of a telephone scam will change with each victim, telephone scams do have recognizable patterns which can be used for detection. For instance, an IRS scam will include the attacker claiming to represent the IRS, and a romance scam will include the attacker expressing affection for the victim.
Many governmental organizations actively track and characterize existing scams in order to notify the public. Federal organizations which announce the properties of different scams include the Federal Trade Commission [18], the Federal Communications Commission [41], and the U.S. Marshals Service [42]. In spite of the work that these agencies perform to notify the public, many people are either never exposed to their warnings, or they do not remember the warnings in the moment of an actual attack.

ABSTRACT
As social engineering attacks have become prevalent, people are increasingly convinced to give their important personal or financial information to attackers. Telephone scams are common and less well-studied than phishing emails. We have found that social engineering attacks can be characterized by a set of speech acts which are performed as part of the scam. A speech act is a statement or utterance expressed by an individual that not only conveys information but also performs an action [7]. Although attackers adjust their delivery and wording on the phone to match the victim, scams can be grouped into classes that all share common speech acts. Each scam type is identified by a set of speech acts that are collectively referred to as a scam signature. We present a social engineering detection approach called the Anti-Social Engineering Tool (ASsET), which detects attacks based on the semantic content of the conversation. Our approach uses word embedding techniques from natural language processing to determine if the meaning of a scam signature is contained in a conversation. In order to evaluate our approach, a dataset of telephone scams has been gathered, written by volunteers based on examples of real scams from official websites. To the best of our knowledge, this is the first telephone-based scam dataset. Our detection method was able to distinguish scam and non-scam calls with high accuracy.
CCS CONCEPTS
• Security and privacy → Social engineering attacks.

KEYWORDS
Social engineering attacks; Natural language processing; Scam call detection

ACM Reference Format: Ali Derakhshan, Ian G. Harris, and Mitra Behzadi. 2021. Detecting Telephone-based Social Engineering Attacks using Scam Signatures. In Proceedings of the 2021 ACM International Workshop on Security and Privacy Analytics (IWSPA'21), April 28, 2021, Virtual Event, USA. ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3445970.3451152

This work is licensed under a Creative Commons Attribution International 4.0 License. IWSPA'21, April 28, 2021, Virtual Event, USA. © 2021 Copyright held by the owner/author(s). ACM ISBN 978-1-4503-8320-2/21/04. https://doi.org/10.1145/3445970.3451152

1 INTRODUCTION
Social engineering is the act of manipulating people in order to gain something of value [21, 30]. Social engineering is not new.

1.1 Scam Signature
The basic building block of an attack is a speech act [7]: an utterance expressed by an individual that not only conveys information but also performs an action. For example, the sentence "I would like pizza, can you pass it to me?" is a directive speech act since it commands the listener to deliver a pizza. We define a scam signature as a set of utterances that perform speech acts that are collectively unique to a class of social engineering attacks. These utterances are the key points, fulfilling the goal of the scammer for that attack. A scam signature uniquely identifies a class of social engineering attacks in the same way that a malware signature uniquely identifies a class of malware. A sample of an IRS scam is shown in Figure 1. Although the text of this example may vary, it performs the same essential speech acts, which are shown in Figure 2. The sentences which perform the speech acts are labeled with superscripts a and b.
As there is no dataset of scam calls, we performed a study to create one. We collected scam conversation samples from legitimate official websites, which publicize scams to increase public awareness. We then gave these samples to volunteers, who wrote versions of these attacks in their own words. The scam call dataset is presented in Section 4, and we believe that it is the first such dataset available.

Scammer: Hello, is this Bob Smith?
Victim: Yes, this is he.
Scammer: "I'm John, calling from the Social Security Administration. It has come to our attention that there has been suspicious activity on your account spread out over the last six months. Due to participation in highly illicit activities, we have contacted law enforcement agencies to suspend your social security number effective immediately."(a)
Victim: What? I haven't done anything illegal. Could my identity have gotten stolen? I was emailed about a breach in a government database recently...
Scammer: "Tracing your account activity over the past 10 years, it does seem likely that this recent string of activity is due to someone else using your identity for their illicit activity."(b)

Figure 1: Example of IRS scam. (a) First speech act; (b) second speech act.

First: we will suspend your social security number on an immediate basis
Second: as we have received suspicious trail of information in your names

Figure 2: IRS Scam Signature

2.1 Manual Scam Signatures
The key characteristics of many common scams are well known and are available in the public domain. Using these scam descriptions, it is straightforward to manually create scam signatures. In order to demonstrate this process, we have created scam signatures for five known scams by selecting subsets of their sentences found in publicly-available documents. Table 1 shows the sentences contained in each of the scam signatures that we created. The Signature Number column identifies each scam as well as the public source where the scam description was found.
Each signature is a set of 1 to 3 sentences that were taken directly from the published scams. When the number of scam examples available is limited, creating manual scam signatures is an option.

1.2 Scam Detection Approach
We present the Anti-Social Engineering Tool (ASsET), which detects attacks by using scam signatures in the same way that malware signatures are used by anti-virus tools. The key aspect of our approach is the use of word embeddings and sentence embeddings to identify statements in a conversation which have the same meaning as sentences in a signature [27]. It is common in natural language processing to represent the meaning of text as an n-dimensional vector. A word embedding is a vector representation of words with the property that two words with the same meaning are represented by vectors which are close to each other. Word embeddings have been extended to sentence embeddings [10], which represent the meanings of phrases and sentences. By using sentence embeddings, we can identify sentences in a conversation with the same meaning as signature sentences, even when the meaning is expressed differently in English.

2 SCAM SIGNATURES
We define a scam signature as a set of utterances, each containing one or more speech acts [7], that collectively fulfill the goal of the scammer in a scam call. A speech act is an utterance that not only conveys meaning but also performs an action. For example, the sentence "Can you pass me the pizza? I want some." is a directive speech act which commands the listener to perform an action. Attackers can change elements of an attack, but key characteristics of the attack remain consistent. There are many law enforcement and news organizations which identify key characteristics of common scams in order to increase public awareness [1, 25]. For example, for IRS scam calls, the "lawforseniors" website [1] lists several indicators of a scam.
One indicator is that the IRS never asks for credit card credentials on the phone. Another is that the IRS never says that you will be arrested if you do not pay money immediately; moreover, you always have the opportunity to appeal. Based on these points, if someone calls claiming to be from the IRS and asks for credit card credentials, or demands money and threatens that you will be arrested if you don't pay, the call is certainly a scam call.

2.2 Scam Signatures Using Clustering
When a sufficient number of scam examples are available, a scam signature can be defined automatically by clustering based on utterance similarity. Using clustering techniques, we can find the patterns in conversation vectors by finding the clusters' centroids and using them as signature vectors identifying the scam. We used k-means clustering for this purpose. For the number of samples that we had, we set the number of clusters to the square root of the length of the shortest conversation for that scam. We performed k-means clustering to generate a set of centroids which are used as the signature vectors. For example, in the IRS example in Figure 1, the conversation has ten utterances. If we have five conversations with ten utterances each, the number of clusters would be 3, the floor of the square root of 10.

1.3 Scam Dataset
To evaluate our scam call detection system, we need samples of scam calls. Although telephone scams occur frequently, we could not find a large number of actual telephone scam conversations.
The main reason for the lack of such a dataset is that victims do not usually record their phone calls, and if they did, they would be embarrassed to share a conversation with a scammer that led to their financial or identity loss.

Signature 1 [5]:
- we will suspend your social security number on an immediate basis
- as we have received suspicious trail of information in your names
Signature 2 [2]:
- revert as soon as possible on our number, before we begin with the legal proceedings.
- So you never received anything showing dollars we have audit of your taxes between some years.
Signature 3 [4]:
- your bank account will be seized, your credit report will be spoiled, your passport along with State ID will be seized.
- Do you have the money with you?
Signature 4 [2]:
- we will suspend all bank accounts and tax returns bearing your name and social security number
- To review immediate rights and details, and avoid all further proceedings, please contact our firm
Signature 5 [3]:
- you can make a lot of money in a few short months you invest
- you will receive dollar return on your money in just six months and there is no risk of loss
- the deal is for today only. The opportunity will be gone tomorrow

Table 1: Scam Signatures

With five conversations of ten utterances each, we would have 50 utterance vectors and 3 clusters. We obtained the cluster centroids on the training data for each scam. These centroids become the scam signatures. So in this approach, instead of manually creating sentences as a scam signature and then vectorizing them, we obtain the signature vectors directly using clustering; they therefore do not correspond to actual sentences in a scam. In the test data, we compare the conversation similarity to each signature in the same way as for manual signatures, presented in Algorithm 1. For each scam, the number of clusters is different.
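The clustering step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names (num_clusters, signature_vectors) and the toy vectors are my own, and a tiny k-means loop is inlined to keep the sketch self-contained (in practice a library implementation such as scikit-learn's KMeans would be used).

```python
import math
import numpy as np

def num_clusters(conversations):
    """k = floor(sqrt(length of the shortest conversation)), as described above."""
    shortest = min(len(conv) for conv in conversations)
    return math.floor(math.sqrt(shortest))

def signature_vectors(utterance_vectors, k, iters=20, seed=0):
    """Minimal k-means: the k centroids serve as the scam's signature vectors."""
    rng = np.random.default_rng(seed)
    X = np.asarray(utterance_vectors, dtype=float)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign every utterance vector to its nearest centroid
        labels = ((X[:, None] - centroids[None]) ** 2).sum(axis=2).argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

# Five conversations of ten utterance vectors each -> 50 vectors,
# k = floor(sqrt(10)) = 3, matching the worked example in the text.
convs = [[np.full(4, float(i)) for _ in range(10)] for i in range(5)]
k = num_clusters(convs)
centroids = signature_vectors([u for conv in convs for u in conv], k)
```

The centroids play the role of signature vectors; at test time a conversation is compared against them exactly as against manually created signature vectors.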
So the average value of the f_similarity score for training conversations will differ across scams (scam signatures), which might lead to a bias toward selecting some scams over others. We therefore normalize this score for each scam on the test set by the average similarity score in the training set. To find the threshold, we calculated the f_similarity scores of the training conversations against the signatures. The scam conversations generally have a higher similarity to the signatures than non-scam conversations, so the threshold must lie between them to minimize misclassification error. We take the average similarity of the scam and non-scam conversations to the signatures and set the threshold to the middle of the two averages.

To perform these comparisons, we use the concept of word embeddings and sentence embeddings, a well-accepted approach to capturing the meaning of an utterance as an n-dimensional vector [27]. The great benefit of word embeddings is that two utterances with similar meaning will have similar vector representations. Using embeddings allows the meaning of two utterances to be compared by simply computing the dot product of the two vectors. The first widely adopted word embedding approach was word2vec [29], but there have been several approaches more recently, including GloVe [32] and ELMo [33]. The assumption is that two words have similar meaning if they are used in the same context inside utterances. Words which are used in the same context, over a large corpus of utterances, will be represented by similar embedding vectors. Word embeddings can be used to compare the meanings of individual words, but we need to compare the meanings of more complex utterances such as entire sentences. Several approaches have been proposed to produce embeddings for sentences, and we have chosen to use the Universal Sentence Encoder approach presented by Google Research [11].
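The embedding comparison can be illustrated with a toy sketch. This is not the real encoder (loading the Universal Sentence Encoder requires the TensorFlow Hub model); the word vectors below are invented stand-ins, and the averaging step only mimics the first stage of a Deep Averaging Network.

```python
import numpy as np

# Toy word vectors standing in for a real embedding model; the words and
# values are invented for illustration only.
WORD_VECS = {
    "suspend": np.array([1.0, 0.0]), "freeze": np.array([0.9, 0.1]),
    "account": np.array([0.0, 1.0]), "number": np.array([0.1, 0.9]),
}

def embed(sentence):
    """Average the word vectors and L2-normalize, mimicking the first step
    of a Deep Averaging Network (the real model adds a feedforward net)."""
    vecs = [WORD_VECS[w] for w in sentence.lower().split() if w in WORD_VECS]
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

def similarity(a, b):
    """Dot product of normalized embeddings, i.e. cosine similarity."""
    return float(np.dot(embed(a), embed(b)))

paraphrase = similarity("suspend account", "freeze number")  # near 1.0
unrelated = similarity("suspend", "account")                 # 0.0
```

Because "freeze number" averages to nearly the same vector as "suspend account", the two phrasings compare as near-identical in meaning, which is exactly the property the detection approach relies on.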
We use their Deep Averaging Network model, which averages the embeddings of the words and bigrams contained in the sentence and passes the result through a feedforward deep neural network.

3 SCAM DETECTION
We present a social engineering detection approach called the Anti-Social Engineering Tool (ASsET), which detects attacks based on the conversation's semantic content. Each scam signature contains a set of utterances, and we check for the presence of each of them in the conversation. Each of these utterances includes one or more speech acts whose presence in a conversation is a sign of a scam. These speech acts themselves can be expressed in many different ways.

3.1 Finding Meaning in Text
We need a way to compare the meanings of two utterances to determine if they perform the same speech acts. A single speech act can be expressed in many different ways, and our comparison approach must be independent of this variation.

3.2 Scam Detection Algorithm
Algorithm 1 shows the pseudo-code of our detection approach. The inputs of this algorithm are a conversation C and the set of all scam signatures, allSignatures. A conversation C is a list of utterances spoken by the communicating parties which must be evaluated to see if a scam is occurring. The set allSignatures contains all scam call signatures that we want to detect in a conversation; it is shown in Table 1. The output of the algorithm is either the signature of the scam contained in the conversation, or NULL if no scam is detected. At code lines 1 and 2 we vectorize the sentences in the conversation C and the utterances in the allSignatures set. Vectorization is performed using the embed function, which is the Universal Sentence Encoder described in previous work [9]. The vectorization of each sentence gives us a representation of its meaning which can be used for comparison. In code line 3, we initialize bestSim to 0.
On code lines 4 to 10, we find the most similar signature and its similarity score (the bestSim variable). First, we find the similarity of the conversation to each signature. The f_similarity function accepts a conversation and a signature and returns a similarity value for that signature. To understand this function better, Figure 3 gives a real example of this calculation based on our IRS example. The conversation in Figure 1 has ten utterances, each of which corresponds to a row in Figure 3. The signature that we are comparing to, shown in Figure 2, contains two signature vectors, v1 and v2, each of which corresponds to a column in Figure 3. The contents of the table shown in Figure 3 are the similarity values between the corresponding utterance and signature vectors, computed as the inner products between the corresponding vectors. Then we find the most similar utterance to each signature vector (the max of each column); these are the highlighted cells in the figure. The third utterance in the conversation is the most similar utterance to the first vector of the signature, and the fifth utterance is the most similar to the second vector. We then average these values to get the similarity of the conversation and the signature. In code lines 6 to 9, we keep track of the most similar signature and its similarity. After going through all the signatures, we have the most similar signature and its similarity score (the bestSim variable). In code lines 11 to 15, we check whether the conversation's most similar signature is above a threshold. If it is below the threshold, we classify the conversation as a non-scam and return NULL to signify this. If it is above the threshold, it is a scam conversation, and we return the signature.
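The f_similarity computation and the surrounding detection loop can be sketched as follows. This is a minimal illustration with my own toy vectors, assuming the utterance and signature vectors are already normalized embeddings; the function names mirror the pseudo-code but the code itself is not the authors'.

```python
import numpy as np

def f_similarity(conv_vecs, sig_vecs):
    """Inner products of every utterance vector with every signature vector
    (the table of Figure 3), max over each column (best-matching utterance
    per signature vector), then the mean of those maxima."""
    table = np.asarray(conv_vecs) @ np.asarray(sig_vecs).T
    return float(table.max(axis=0).mean())

def detect(conv_vecs, all_signatures, threshold):
    """The Algorithm 1 loop: return the best-matching signature name,
    or None if no signature scores above the threshold."""
    best_sig, best_sim = None, 0.0
    for name, sig_vecs in all_signatures.items():
        sim = f_similarity(conv_vecs, sig_vecs)
        if sim > best_sim:
            best_sim, best_sig = sim, name
    return best_sig if best_sim > threshold else None

# Toy normalized vectors: the conversation contains close matches for both
# signature vectors of "irs", but nothing close to "investment".
conv = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
signatures = {
    "irs": np.array([[1.0, 0.0], [0.0, 1.0]]),
    "investment": np.array([[-1.0, 0.0]]),
}
label = detect(conv, signatures, threshold=0.5)
```

Taking the maximum per signature vector means a signature matches as long as each of its speech acts appears somewhere in the conversation, regardless of where or in what order.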
We apply this to all the conversations in our test set.

Algorithm 1: Detecting scam signatures
Input: conversation C, the set allSignatures
Output: the best matching signature, or NULL
1  C = embed(C);
2  allSignatures = embed(allSignatures);
3  bestSim = 0;
4  for signature in allSignatures do
5      sigSimilarity = f_similarity(C, signature);
6      if sigSimilarity > bestSim then
7          bestSim = sigSimilarity;
8          bestSig = signature;
9      end
10 end
11 if bestSim > Threshold then
12     return bestSig;
13 else
14     return NULL;
15 end

bestSim keeps track of the best similarity found between the conversation and a signature.

Figure 3: f_similarity for the IRS example in Figures 1 and 2

4 SCAM DATASET
To evaluate our detection approach, we need a dataset composed of both scam conversations and non-scam conversations. For the non-scam conversations, we use the CallHome dataset, part of the TalkBank project [28], which contains transcripts of 140 spoken English conversations (120 participants; 140 of the 176 files are conversations). Although there exist non-scam datasets, to the best of our knowledge there is no existing dataset of telephone scam conversations. Although we did find some individual scams which were publicly available, we could not find enough publicly-available scams to enable us to evaluate our approach. There are many datasets of phishing emails, but telephone scams are of particular interest to us due to their unique nature. A likely reason for the lack of real telephone scam datasets is the existence of wiretap laws in some states which prevent the recording of calls without the consent of both parties. We have conducted a human subject study to generate a set of scams that can be used to evaluate scam detection. We recruited 15 computer science graduate students from our university and asked them to write scam conversations in their own words. Each subject was given five prompts taken from public websites, which presented examples of different types of telephone scams.
Each participant was requested to write their own version of each of the five types of scams, using the prompts as a guide. The prompts were derived from publicly available websites that raise public awareness of scams.

Algorithm 2: The f_similarity function
Function f_similarity(Conversation C, signature S)
    Data: a conversation C (a list of u utterance vectors) and a signature S (a list of v vectors)
    similarityTable = innerProduct(C, S);
    maxMatches = signatureUtteranceMax(similarityTable);
    sigSimilarity = mean(maxMatches);
    Result: a value giving the similarity of the conversation to the signature
end

Scam Number | # Conv. | Mean  | Variance | Max | Min
1 [5]       | 15      | 1.6   | 5.04     | 10  | 1
2 [2]       | 15      | 1     | 0        | 1   | 1
3 [4]       | 15      | 24.46 | 41.58    | 30  | 9
4 [2]       | 15      | 1     | 0        | 1   | 1
5 [3]       | 15      | 17.8  | 13.35    | 26  | 10

Table 2: Statistics of the length of scam conversations in the dataset

5 RESULTS
Our dataset consists of the 75 telephone scam call samples generated by our study and the 140 non-scam telephone call transcripts from the CallHome dataset [28].

Training/Test set. To find our model parameters, the threshold and the cluster centroids, we need a training set separate from the test set, to ensure the parameters are selected without observing the test set. We select the training set by choosing 30 non-scam conversations and 40 scam conversations, 8 scam conversations for each of the 5 types of scams. The test set includes 110 non-scam conversations and the remaining 35 scam conversations (7 of each scam type) which are not in the training set.
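The split described above can be sketched as follows; a minimal illustration, not the authors' code, with the function name, seed, and conversation labels invented for the example.

```python
import random

def split_dataset(scams_by_type, nonscams, scam_train_per_type=8,
                  nonscam_train=30, seed=0):
    """Training set: 30 non-scam conversations plus 8 scam conversations
    per type; everything else goes to the test set, as described above."""
    rng = random.Random(seed)
    non = nonscams[:]
    rng.shuffle(non)
    train = non[:nonscam_train]
    test = non[nonscam_train:]
    for convs in scams_by_type.values():
        c = convs[:]
        rng.shuffle(c)
        train += c[:scam_train_per_type]
        test += c[scam_train_per_type:]
    return train, test

# 5 scam types x 15 conversations, plus 140 non-scam conversations
scams = {t: [f"scam{t}_{i}" for i in range(15)] for t in range(1, 6)}
nonscams = [f"call{i}" for i in range(140)]
train, test = split_dataset(scams, nonscams)
```

With the paper's counts this yields 70 training conversations (30 non-scam + 40 scam) and 145 test conversations (110 non-scam + 35 scam).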
Threshold = (averageScore(scam) + averageScore(nonScam)) / 2    (1)

Manual Signatures results.

            Accuracy  Precision  Recall  F-measure
non-scam    0.903     0.907      0.973   0.939
Scam 1 [5]  0.993     0.875      1.000   0.933
Scam 2 [2]  0.959     0.667      0.286   0.400
Scam 3 [4]  0.979     0.833      0.714   0.769
Scam 4 [2]  0.972     0.800      0.571   0.667
Scam 5 [3]  0.972     0.800      0.571   0.667
Mean        0.963     0.814      0.686   0.729
Variance    0.001     0.006      0.061   0.034

Table 3: Results of multi-class classification

Table 4 shows the overall performance of multi-class classification by averaging the results in Table 3. Macro averaging computes an average of the F-measure of each class, while micro averaging computes the F-measure from the average precision and recall scores of each class.

Figure 4: k-means clustering accuracy as the percentage of training data increases

In Figure 4, we gradually increased the percentage of scam conversations in the training set (the number of non-scam conversations is fixed) and plotted the test accuracy (on the remaining conversations) using our clustering method. The accuracy is generally high even with 20 percent of the scam conversations, and it increases slightly when using about half of the scam conversations for training. We chose to use approximately 53.3% of the scam conversations because that is where the peak accuracy is achieved.

We show two sets of results. In the first set, we perform multi-class classification, classifying each call as either non-scam or one of the five types of scam which we consider. In the second set, we perform binary classification with two classes, either scam or non-scam. Table 3 shows the results of multi-class classification. For each scam type, we have calculated the accuracy, precision, recall, and F-measure. The accuracies for all scam types are higher than 90%, but the recall is relatively low for scam type 2. We believe that this is because the prompt provided to the study participants for this scam was only two sentences long.
As a result, the participants included a larger amount of elaboration, causing the scams to diverge from the prompt.

Table 2 shows statistics describing our scam dataset. The dataset contains 15 conversations of each scam type, for a total of 75 scam conversations. The table also shows the variation in the length (lines of text) of the scams produced by the participants. This dataset has been made publicly available.[1]

[1] The link to the dataset repository is removed for anonymity.

5.1 Similarity Threshold
Our detection approach depends on a similarity threshold which is used to determine whether or not a conversation is sufficiently close to a signature. To find the threshold, we evaluate the similarity scores of the conversations in the training set using Equation 1.

                 F-measure
Macro averaging  0.729
Micro averaging  0.889

Table 4: F-measure general performance using micro and macro averaging

Table 5 shows the results of binary classification, either scam or non-scam. These results better indicate our method's ability to detect scam calls, as some scam types are similar to each other.

          Accuracy  Precision  Recall  F-measure
Non-Scam  0.903     0.906      0.972   0.938
Scam      0.903     0.888      0.685   0.774
AUC: 0.829

Table 5: Results of binary classification

5.3 k-means Clustering Results

          Accuracy  Precision  Recall  F-measure
Non-Scam  0.931     0.916      1.0     0.956
Scam      0.931     1.0        0.714   0.833
AUC: 0.857

Table 8: Results of binary classification with k-means clustering

We have also provided the AUC of this binary classification. The precision of detecting scams is 1.0, but the recall is 0.714; these are good results, but the recall can still be improved. For the clustering results, the train and test sets are the same as before; the only difference is that the signature vectors in this experiment are created automatically using the k-means clustering method.
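The threshold selection of Equation 1 can be sketched as follows; the function name and the training scores are my own, invented for illustration.

```python
def pick_threshold(scam_scores, nonscam_scores):
    """Equation 1: midpoint between the average f_similarity score of the
    scam and non-scam training conversations."""
    avg_scam = sum(scam_scores) / len(scam_scores)
    avg_nonscam = sum(nonscam_scores) / len(nonscam_scores)
    return (avg_scam + avg_nonscam) / 2

# Hypothetical training scores: scam calls score higher against signatures
threshold = pick_threshold([0.8, 0.9, 0.7], [0.3, 0.2, 0.4])  # 0.55
```

Placing the threshold midway between the two class averages is a simple way to balance false positives against false negatives when the two score distributions are well separated.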
In the test conversations, we check whether the score of the most similar signature is above a threshold (obtained using the training set, as described in Section 5.1); if so, we classify the conversation as a sample of that scam. Otherwise, we classify it as a non-scam conversation. In Table 6, the accuracy, precision, recall, and F-measure of our classification are presented. As can be seen, the precision is 1.0 for all scam types, meaning that whenever we classify a conversation as a scam, it is a genuine scam (no false positives), although we might miss some scam conversations. Having no false positives is crucial, since we do not want to flag a vital call as a scam and interrupt it; we want to detect a call as a scam only when we are sure about it. This result shows that the approach works with high accuracy without disrupting legitimate calls. We do not need to create signature sentences manually, as we can obtain them automatically using the k-means clustering approach.

6 RELATED WORK
We summarize related research in the detection of social engineering and phishing attacks; a more detailed exposition can be found in a survey on the topic [13]. Previous work in the detection of social engineering attacks can be viewed according to the algorithmic approach used as well as the features used by the algorithm. Several of the early approaches are rule-based [12, 22, 43], while most newer techniques use some form of machine learning [6, 8, 17, 26, 31, 34, 35]. Rule-based approaches have used statistical methods to identify anomalous emails or web pages. For example, the CANTINA technique [43] evaluates web pages by using the TF-IDF metric to rank the words found on a web page and then sends the top 5 words to a search engine to see if the page is found.
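The word-ranking step that CANTINA is described as using can be roughly sketched as follows; this is my own illustration of TF-IDF ranking, not the original implementation, and the page text and background corpus are invented.

```python
import math
from collections import Counter

def top_tfidf_words(page_words, corpus_docs, k=5):
    """Rank a page's words by TF-IDF against a background corpus and return
    the top k (CANTINA sends these to a search engine as a query)."""
    tf = Counter(page_words)
    n_docs = len(corpus_docs)

    def idf(word):
        # document frequency against the background corpus
        df = sum(1 for doc in corpus_docs if word in doc)
        return math.log(n_docs / (1 + df))

    scores = {w: count * idf(w) for w, count in tf.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Invented page text and background corpus for illustration
page = "verify your paypal account please verify now".split()
corpus = [{"the", "weather", "is", "nice"}, {"sports", "news", "today"},
          {"please", "read", "the", "news"}]
top = top_tfidf_words(page, corpus)
```

Words that are frequent on the page but rare in the background corpus ("verify", "paypal") rank highest, which is why they make a distinctive search query for the legitimate version of the page.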
An approach to spear phishing detection [22] produced excellent detection results by ranking click-in-mail events based on their count of "suspicious" features as compared to all other events. Phishing detection approaches using machine learning extract a set of features from phishing emails and then create a classifier to identify phishing emails based on those features. Most machine learning papers develop many different classifiers for comparison. For instance, researchers in [26] use Multi-Layer Perceptron (MLP) neural networks, Naive Bayes classification, decision trees, and random forests. The success of previous approaches to phishing detection depends heavily on their selection of features to extract from the email or web page being evaluated. Many features are based on "meta-data" related to an email, including SMTP headers, NIDS logs, LDAP logs, and cc lists [15, 22, 36]. Some features consider the content of the email, including whether or not it contains HTML and JavaScript [35] and the structure of URL links found in the email [6, 8]. A common content feature is the frequency of certain words whose use is associated with phishing emails or web pages [6, 8, 26, 35]. For example, in [40] researchers associate a set of words with a sense of urgency, which is commonly conveyed in phishing emails. Some researchers have used semantic features in emails to detect scams [38, 39]. NLP techniques have also been applied to detect email spam, and have been shown to detect it robustly [16].
          Accuracy  Precision  Recall  F-measure
non-scam  0.931     0.917      1.000   0.957
Scam 1    1.000     1.000      1.000   1.000
Scam 2    0.979     1.000      0.571   0.727
Scam 3    0.979     1.000      0.571   0.727
Scam 4    0.986     1.000      0.714   0.833
Scam 5    0.986     1.000      0.714   0.833
Mean      0.977     0.986      0.762   0.846
Variance  0.000     0.001      0.032   0.011

Table 6: Results of multi-class classification with k-means clustering

Comparing Table 6 with Table 3, we can see that the k-means clustering method achieves better results than the manual signature method. Table 7, which reports the F-measure using macro and micro averaging, shows that these results are better than those of the manual signatures as well.

                 F-measure
Macro averaging  0.846
Micro averaging  0.931

Table 7: F-measure general performance using micro and macro averaging with k-means clustering

Table 8 shows the results when we group all the scams and perform a binary classification between the scam and non-scam classes.

7 CONCLUSIONS
We present the idea of a scam signature to uniquely identify a class of social engineering attacks, in much the same way that malware signatures are used to identify malware. The signatures are based on the content of the conversation rather than any meta-data, so they
can be applied to telephone-based and in-person attacks which have no meta-data. We demonstrate the effectiveness of scam signatures by using them to implement a social engineering detection tool, ASsET, which compares signatures to a conversation to determine if an attack is being performed. A sentence embedding approach used widely in the NLP domain is used to compare the meaning of the signature to the meaning of sentences in the conversation. In order to demonstrate the effectiveness of our approach for the detection of telephone-based attacks, we have gathered a set of realistic attacks by performing a human subject study in which volunteers created scam scripts based on a set of existing telephone scams, which were provided as prompts. By evaluating our detection approach with our scam dataset, together with a set of non-scam conversations, we have shown that our detection approach has high accuracy and F-score. We have also made our telephone scam dataset publicly available to support future research in social engineering attacks.

[17] Ian Fette, Norman Sadeh, and Anthony Tomasic. 2007. Learning to Detect Phishing Emails. In Proceedings of the 16th International Conference on World Wide Web.
[18] ftc 2020. Federal Trade Commission, Scams. Federal Trade Commission. https://www.consumer.ftc.gov/features/scam-alerts
[19] ftc2 2016 (accessed June 11, 2020). Federal Trade Commission, Scams. Federal Trade Commission. https://www.consumer.ftc.gov/blog/2016/05/scammers-can-fake-caller-id-info
[20] Christopher Hadnagy. 2011. Social Engineering: The Art of Human Hacking. Wiley Publishing Inc.
[21] C. Hadnagy and P. Wilson. 2010. Social Engineering: The Art of Human Hacking. Wiley.
[22] Grant Ho, Aashish Sharma, Mobin Javed, Vern Paxson, and David Wagner. 2017. Detecting Credential Spearphishing in Enterprise Settings. In 26th USENIX Security Symposium (USENIX Security 17).
[23] Tom N. Jagatic, Nathaniel A. Johnson, Markus Jakobsson, and Filippo Menczer. 2007. Social phishing. Commun. ACM 50, 10 (2007), 94–100.
[24] Martin Kaste. 2019 (accessed June 11, 2020). Cybercrime Booms As Scammers Hack Human Nature To Steal Billions. National Public Radio. https://www.npr.org/2019/11/18/778894491/cybercrime-booms-as-scammers-hack-human-nature-to-steal-billions
[25] Allen Kim. [n.d.]. A scam targeting Americans over the phone has resulted in millions of dollars lost to hackers. Don't be the next victim.
https://www.cnn.com/ 2019/10/27/business/phishing-bank-scam-trnd/index.html [26] Merton Lansley, Francois Mouton, Stelios Kapetanakis, and Nikolaos Polatidis. 2020. SEADer++: social engineering attack detection in online environments using machine learning. Journal of Information and Telecommunication (2020). [27] Yang Li and Tao Yang. 2018. Word Embedding for Understanding Natural Language: A Survey. Springer International Publishing. [28] Brian MacWhinney and Johannes Wagner. 2010. Transcribing, searching and data sharing: The CLAN software and the TalkBank data repository. Gesprachsforschung: Online-Zeitschrift zur verbalen Interaktion 11 (2010), 154. [29] Tomas Mikolov, Kai Chen, Greg S. Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. http://arxiv.org/abs/1301. 3781 [30] K.D. Mitnick and W.L. Simon. 2009. The Art of Intrusion: The Real Stories Behind the Exploits of Hackers, Intruders and Deceivers. Wiley. [31] Ying Pan and Xuhua Ding. 2006. Anomaly Based Web Phishing Page Detection. In Computer Security Applications Conference, 2006. ACSAC ’06. 22nd Annual. [32] Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). [33] Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). [34] Venkatesh Ramanathan and Harry Wechsler. 2012. phishGILLNET—phishing detection methodology using probabilistic latent semantic analysis, AdaBoost, and co-training. EURASIP Journal on Information Security (2012). [35] H. Sandouka, A. J. Cullen, and I. Mann. 2009. Social Engineering Detection Using Neural Networks. 
In 2009 International Conference on CyberWorlds. 273–278. [36] Gianluca Stringhini and Olivier Thonnard. 2015. That Ain’t You: Blocking Spearphishing Through Behavioral Modelling. In Proceedings of the 12th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment - Volume 9148 (DIMVA 2015). [37] Verizon. 2019. 2019 Data Breach Investigations Report. https://enterprise.verizon. com/resources/reports/dbir/. [38] Rakesh Verma and Nabil Hossain. 2013. Semantic feature selection for text with application to phishing email detection. In International Conference on Information Security and Cryptology. Springer, 455–468. [39] Rakesh Verma and Nirmala Rai. 2015. Phish-IDetector: Message-ID based automatic phishing detection. In 2015 12th International Joint Conference on e-Business and Telecommunications (ICETE), Vol. 4. IEEE, 427–434. [40] Rakesh Verma, Narasimha Shashidhar, and Nabil Hossain. 2012. Detecting Phishing Emails the Natural Language Way. In Computer Security – ESORICS 2012, Sara Foresti, Moti Yung, and Fabio Martinelli (Eds.). [41] Patrick Webre. 2019. Exposing Voicemail Call-Back Scams. Federal Communications Commission Blog. https://www.fcc.gov/news-events/blog/2019/08/28/ exposing-voicemail-call-back-scams [42] Patrick Webre. 2020. UPDATED FRAUD ADVISORY (March 2020). U.S. Marshals Service. https://www.usmarshals.gov/news/chron/2019/scam-alerts.htm [43] Yue Zhang, Jason I. Hong, and Lorrie F. Cranor. 2007. Cantina: A Content-based Approach to Detecting Phishing Web Sites. In Proceedings of the 16th International Conference on World Wide Web. ACKNOWLEDGMENTS This material is based upon work supported by the National Science Foundation under Grant No. 1813858. REFERENCES [1] [n.d.]. TAX SCAMS THAT TARGET MILLIONS OF AMERICANS. https://www.lawforseniors.org/topics/consumer-scams/305-tax-scamsthat-target-millions-of-americans [2] 2020. Exposing Voicemail Call-Back Scams. 
https://www.fcc.gov/news-events/ blog/2019/08/28/exposing-voicemail-call-back-scams [3] 2020. Investment Fraud Script. https://ag.ny.gov/sites/default/files/pdfs/bureaus/ investor_protection/exhibit_k.pdf [4] 2020. IRS Scam phone Transcript Tax Resolution Institute. https://www. taxresolutioninstitute.com/irs-scam-phone-transcript/ [5] 2020. This is what a Social Security scam sounds like. https://www.consumer. ftc.gov/blog/2018/12/what-social-security-scam-sounds?page=1 [6] Saeed Abu-Nimeh, Dario Nappa, Xinlei Wang, and Suku Nair. 2007. A comparison of machine learning techniques for phishing detection. In Proceedings of the antiphishing working groups 2nd annual eCrime researchers summit. 60–69. [7] Kent Bach and Robert M Harnish. 1979. Linguistic communication and speech acts. (1979). [8] Ram Basnet, Srinivas Mukkamala, and Andrew H Sung. 2008. Detection of phishing attacks: A machine learning approach. In Soft Computing Applications in Industry. Springer, 373–383. [9] Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, et al. 2018. Universal sentence encoder. arXiv preprint arXiv:1803.11175 (2018). [10] Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Brian Strope, and Ray Kurzweil. 2018. Universal Sentence Encoder for English. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. [11] Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Brian Strope, and Ray Kurzweil. 2018. Universal Sentence Encoder for English. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. [12] Neil Chou, Robert Ledesma, Yuka Teraguchi, Dan Boneh, and John C. Mitchell. 2004. 
Client-Side Defense against Web-Based Identity Theft. In Network and Distributed Systems Security Symposium (NDSS). [13] A. Das, S. Baki, A. El Aassal, R. Verma, and A. Dunbar. 2020. SoK: A Comprehensive Reexamination of Phishing Research From the Security Perspective. IEEE Communications Surveys Tutorials 22, 1 (2020), 671–708. [14] Robin Dreeke. 2013. It’s not all about "me", the top ten techniques for building quick rapport with anyone. People Formula. [15] S. Duman, K. Kalkan-Cakmakci, M. Egele, W. Robertson, and E. Kirda. 2016. EmailProfiler: Spearphishing Filtering with Header and Stylometric Features of Emails. In 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), Vol. 1. [16] Gal Egozi and Rakesh Verma. 2018. Phishing email detection using robust nlp techniques. In 2018 IEEE International Conference on Data Mining Workshops (ICDMW). IEEE, 7–12. 73