KC7 - The free cyber detective game

<style>pre { line-height: 125%; } td.linenos .normal { color: #6e7681; background-color: #0d1117; padding-left: 5px; padding-right: 5px; } span.linenos { color: #6e7681; background-color: #0d1117; padding-left: 5px; padding-right: 5px; } td.linenos .special { color: #e6edf3; background-color: #6e7681; padding-left: 5px; padding-right: 5px; } span.linenos.special { color: #e6edf3; background-color: #6e7681; padding-left: 5px; padding-right: 5px; } .codehilite .hll { background-color: #6e7681 } .codehilite .c { color: #8b949e; font-style: italic } /* Comment */ .codehilite .err { color: #f85149 } /* Error */ .codehilite .esc { color: #e6edf3 } /* Escape */ .codehilite .g { color: #e6edf3 } /* Generic */ .codehilite .k { color: #ff7b72 } /* Keyword */ .codehilite .l { color: #a5d6ff } /* Literal */ .codehilite .n { color: #e6edf3 } /* Name */ .codehilite .o { color: #ff7b72; font-weight: bold } /* Operator */ .codehilite .x { color: #e6edf3 } /* Other */ .codehilite .p { color: #e6edf3 } /* Punctuation */ .codehilite .ch { color: #8b949e; font-style: italic } /* Comment.Hashbang */ .codehilite .cm { color: #8b949e; font-style: italic } /* Comment.Multiline */ .codehilite .cp { color: #8b949e; font-weight: bold; font-style: italic } /* Comment.Preproc */ .codehilite .cpf { color: #8b949e; font-style: italic } /* Comment.PreprocFile */ .codehilite .c1 { color: #8b949e; font-style: italic } /* Comment.Single */ .codehilite .cs { color: #8b949e; font-weight: bold; font-style: italic } /* Comment.Special */ .codehilite .gd { color: #ffa198; background-color: #490202 } /* Generic.Deleted */ .codehilite .ge { color: #e6edf3; font-style: italic } /* Generic.Emph */ .codehilite .gr { color: #ffa198 } /* Generic.Error */ .codehilite .gh { color: #79c0ff; font-weight: bold } /* Generic.Heading */ .codehilite .gi { color: #56d364; background-color: #0f5323 } /* Generic.Inserted */ .codehilite .go { color: #8b949e } /* Generic.Output */ .codehilite .gp { color: #8b949e } /* Generic.Prompt */ .codehilite .gs { color: #e6edf3; font-weight: bold } /* Generic.Strong */ .codehilite .gu { color: #79c0ff } /* Generic.Subheading */ .codehilite .gt { color: #ff7b72 } /* Generic.Traceback */ .codehilite .g-Underline { color: #e6edf3; text-decoration: underline } /* Generic.Underline */ .codehilite .kc { color: #79c0ff } /* Keyword.Constant */ .codehilite .kd { color: #ff7b72 } /* Keyword.Declaration */ .codehilite .kn { color: #ff7b72 } /* Keyword.Namespace */ .codehilite .kp { color: #79c0ff } /* Keyword.Pseudo */ .codehilite .kr { color: #ff7b72 } /* Keyword.Reserved */ .codehilite .kt { color: #ff7b72 } /* Keyword.Type */ .codehilite .ld { color: #79c0ff } /* Literal.Date */ .codehilite .m { color: #a5d6ff } /* Literal.Number */ .codehilite .s { color: #a5d6ff } /* Literal.String */ .codehilite .na { color: #e6edf3 } /* Name.Attribute */ .codehilite .nb { color: #e6edf3 } /* Name.Builtin */ .codehilite .nc { color: #f0883e; font-weight: bold } /* Name.Class */ .codehilite .no { color: #79c0ff; font-weight: bold } /* Name.Constant */ .codehilite .nd { color: #d2a8ff; font-weight: bold } /* Name.Decorator */ .codehilite .ni { color: #ffa657 } /* Name.Entity */ .codehilite .ne { color: #f0883e; font-weight: bold } /* Name.Exception */ .codehilite .nf { color: #d2a8ff; font-weight: bold } /* Name.Function */ .codehilite .nl { color: #79c0ff; font-weight: bold } /* Name.Label */ .codehilite .nn { color: #ff7b72 } /* Name.Namespace */ .codehilite .nx { color: #e6edf3 } /* Name.Other */ .codehilite .py { color: #79c0ff } /* Name.Property */ .codehilite .nt { color: #7ee787 } /* Name.Tag */ .codehilite .nv { color: #79c0ff } /* Name.Variable */ .codehilite .ow { color: #ff7b72; font-weight: bold } /* Operator.Word */ .codehilite .pm { color: #e6edf3 } /* Punctuation.Marker */ .codehilite .w { color: #6e7681 } /* Text.Whitespace */ .codehilite .mb { color: #a5d6ff } /* Literal.Number.Bin */ .codehilite .mf { color: #a5d6ff } /* Literal.Number.Float */ .codehilite .mh { color: #a5d6ff } /* Literal.Number.Hex */ .codehilite .mi { color: #a5d6ff } /* Literal.Number.Integer */ .codehilite .mo { color: #a5d6ff } /* Literal.Number.Oct */ .codehilite .sa { color: #79c0ff } /* Literal.String.Affix */ .codehilite .sb { color: #a5d6ff } /* Literal.String.Backtick */ .codehilite .sc { color: #a5d6ff } /* Literal.String.Char */ .codehilite .dl { color: #79c0ff } /* Literal.String.Delimiter */ .codehilite .sd { color: #a5d6ff } /* Literal.String.Doc */ .codehilite .s2 { color: #a5d6ff } /* Literal.String.Double */ .codehilite .se { color: #79c0ff } /* Literal.String.Escape */ .codehilite .sh { color: #79c0ff } /* Literal.String.Heredoc */ .codehilite .si { color: #a5d6ff } /* Literal.String.Interpol */ .codehilite .sx { color: #a5d6ff } /* Literal.String.Other */ .codehilite .sr { color: #79c0ff } /* Literal.String.Regex */ .codehilite .s1 { color: #a5d6ff } /* Literal.String.Single */ .codehilite .ss { color: #a5d6ff } /* Literal.String.Symbol */ .codehilite .bp { color: #e6edf3 } /* Name.Builtin.Pseudo */ .codehilite .fm { color: #d2a8ff; font-weight: bold } /* Name.Function.Magic */ .codehilite .vc { color: #79c0ff } /* Name.Variable.Class */ .codehilite .vg { color: #79c0ff } /* Name.Variable.Global */ .codehilite .vi { color: #79c0ff } /* Name.Variable.Instance */ .codehilite .vm { color: #79c0ff } /* Name.Variable.Magic */ .codehilite .il { color: #a5d6ff } /* Literal.Number.Integer.Long */</style><h1>Welcome to EnvolveLabs!</h1> <p>🥳 Welcome to Envolve Labs Corporation! Today is your first day as a Junior Security Operations Center (SOC) Analyst with our company. Your primary job responsibility is to defend Envolve Labs and its employees from malicious cyber actors.</p> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/9885f1e0-01c4-4f4d-82e2-ef08ea0272b2" /></p> <p>Envolve Labs is a med-tech startup based in the United States that was founded in 2012. Our mission is to develop a new type of flexible vaccine technology that covers many different viral strains and offers long-lasting immunity (which means no more boosters!) Our initial research has proven this technology is highly effective – we’re planning to start production in Q1 2023.</p> <p>EnvolveLabs has a series of key partners who contribute to the success of our business:</p> <table> <thead> <tr> <th>Partner Name</th> <th>Relationship</th> </tr> </thead> <tbody> <tr> <td>We Sell Beakers™ (wesellbeakers.com)</td> <td>A key distributor of medical-grade laboratory equipment, We Sell Beakers™ provides all of the equipment for our world-class research facilities.</td> </tr> <tr> <td>PharmaSupplies, Inc. (pharmasupplies.org)</td> <td>PharmaSupplies provides Envolve Labs withthe core ingredients used in our vaccine products.</td> </tr> <tr> <td>Vaccine Distributors Worldwide (vaccinedistributors.com):</td> <td>EnvolveLabs trusts Vaccine Distributors Worldwide to deliver our temperature and time-sensitive vaccine products to customers around the globe.</td> </tr> <tr> <td>Research Compliance, PSC (researchcompliance.com)</td> <td>Research Compliance, PSC is the key legal partner of Envolve Labs who helps ensure compliance with international guidelines and regulations that govern vaccine production.</td> </tr> </tbody> </table> <p>Until now, we’ve been laser focused on medical research and meeting production goals. But, as our work becomes more important and successful, we’ve realized the need to invest more in cybersecurity efforts. That’s why we’ve hired you!</p> <p>Like all good companies, Envolve Labs collects log data about the activity its employees perform on the corporate network. These security audit logs are stored in Azure Data Explorer (ADX) - a data storage service in Azure (Microsoft’s cloud). You will use the Kusto Query Language (KQL) to parse through various types of security logs. By analysing these logs, you can help us determine whether we’re being targeted by malicious actors.</p> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/61f2adcd-0cd4-4e6e-a007-7fcbcd9e1c72" /></p> <h2>Objectives</h2> <p>🧠 <strong>By the end of your first day on the job, you should be able to:</strong></p> <ul> <li>Use the Azure Data Explorer</li> <li>Use multiple data sets to answer targeted questions</li> <li>Find cyber activity in logs including: email, web traffic, and server logs</li> <li>Use multiple techniques to track the activity of APTs (Advanced Persistent Threats)</li> <li>Use third party data sets to discover things about your attackers</li> <li>Build a threat intelligence report</li> <li>Make recommendations on what actions a company can take to protect themselves</li> </ul> <p>🚀 The attackers have gotten a head start, so let’s not waste any more time… time to get to work!</p> <h3>Legend</h3> <blockquote> <p>🎯 Key Point – Occasionally, you will see a dart emoji with a “key point.” These signal explanations of certain concepts that may enhance your understanding of key cybersecurity ideas that are demonstrated in the game. </p> <p>🤔 Question – “Thinking” emojis represent questions that will enable you to demonstrate mastery of the concepts at hand. You can earn points by entering your responses to questions from section 3 in the scoring portal available at kc7cyber.com/scoreboard.</p> <p>🤫 Hint – “Whisper” emojis represent in-game hints. These hints will guide you in the right direction in answering some of the questions.</p> </blockquote> <h2>Section 1: The Walkthrough</h2> <h4>Getting Set Up in Azure Data Explorer (ADX)</h4> <p>ADX is the primary tool used in the Envolve Labs SOC for data exploration and analysis. The great thing about ADX is that it is used by cyber analysts at many of the smallest and largest organizations in the world.</p> <p>Let’s get you logged in and started with ADX: 1. Go to https://dataexplorer.azure.com/ and login with your @microsoft.com account credentials. 2. Click the Query tab on the left side of the screen, or this may already be selected for you.</p> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/57a418ea-d89d-43cf-8560-b2c61ad8efb9" /></p> <p>Data in ADX is organized in a hierarchical structure which consists of clusters, databases, and tables. All of Envolve Labs’s security logs are stored in a single cluster. Once you login, you should see a cluster called <em>“kc7001.eastus”</em> has already been added to your account.</p> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/01d730eb-bbcd-4b2f-bb43-3a704921f76e" /></p> <p>Data in ADX is organized in a hierarchical structure which consists of <strong>clusters</strong>, <strong>databases</strong>, and <strong>tables</strong>.</p> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/d912c393-10d6-4db3-962d-519948535952" /></p> <p>All of Envolve Labs' security logs are stored in a single database – the EvolveLabs database. </p> <ol> <li>Select your database. <ul> <li>Expand the dropdown arrow next to the EvolveLabs database.</li> <li>Click on the <strong>EvolveLabs</strong> database. Once you’ve done this, you should see the database highlighted- this means you’ve selected the database and are ready to query the tables inside.</li> </ul> </li> </ol> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/48b9258a-7476-45e5-8c56-66b615e38968" /></p> <p>Note: It’s very important that you use the EvolveLabs database for all questions while you’re investigating activity at EvolveLabs! If you choose the wrong database, you won’t be able to answer questions correctly.</p> <p>The big space to the right of your cluster list is the <em>query workspace</em>. That’s where you’ll actually write the queries used to interact with our log data.</p> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/15c43c98-0dac-40de-a282-5894afc1606b" /></p> <p>Click the blue <strong>Run</strong> button above the query workspace to run your first query! Once you’ve done that, you can erase the welcome message by highlighting it and pressing backspace or delete on your keyboard.</p> <p>Okay, enough introductions… let’s get your hands on the data.</p> <h4>First Look at the data...</h4> <p>The SecurityLogs database contains eight tables. Tables contain many rows of similar data. For security logs, a single row typically represents a single thing done by an employee or a device on the network at a particular time. We currently have eight types of log data. As you’ll see in ADX, each log type corresponds to a table that exists in the SecurityLogs database:</p> <table> <thead> <tr> <th><strong>Table Name</strong></th> <th><strong>Description</strong></th> </tr> </thead> <tbody> <tr> <td>Employees</td> <td>Contains information about the company’s employees</td> </tr> <tr> <td>Email</td> <td>Records emails sent and received by employees</td> </tr> <tr> <td>InboundNetworkEvents</td> <td>Records inbound network events including browsing activity from the Internet to devices within the company network</td> </tr> <tr> <td>OutboundNetworkEvents</td> <td>Records outbound network events including browsing activity from within the company network out to the Internet</td> </tr> <tr> <td>AuthenticationEvents</td> <td>Records successful and failed logins to devices on the company network. This includes logins to the company’s mail server.</td> </tr> <tr> <td>FileCreationEvents</td> <td>Records files stored on employee’s devices</td> </tr> <tr> <td>ProcessEvents</td> <td>Records processes created on employee’s devices</td> </tr> <tr> <td>PassiveDns (External)</td> <td>Records IP-domain resolutions</td> </tr> <tr> <td>SecurityAlerts</td> <td>Records security alerts from an employee’s device or the company’s email security system</td> </tr> </tbody> </table> <blockquote> <p>🎯<strong>Key Point – Over the Horizon (OTH) data</strong>: One of the tables listed above is not like the others – <strong>PassiveDns</strong>. Rather than being an internal security log, PassiveDns is a data source that we’ve purchased from a 3rd party vendor. Not all malicious cyber activity happens within our company network, so sometimes we depend on data from other sources to complete our investigations.</p> </blockquote> <p>You’ll learn more about how to use each of these datasets in just a minute. First, let’s just run some queries so you can practice using KQL and ADX.</p> <h4>KQL 101</h4> <p>Type the following query in the workspace to view the first rows in the <strong>Employees</strong> table. Press “run” or “shift + enter” to execute the query.</p> <div class="codehilite"><pre><span></span><code><span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="n">take</span><span class="w"> </span><span class="mi">10</span> </code></pre></div> <p>This query has a few parts. Let’s take a moment to break each of them down:</p> <p><img alt="Diagram Description automatically generated with low confidence" src="https://lh4.googleusercontent.com/y7YpmVbUlak4wMrI43nYzAqieamRnwiANLJzy8UsxHHaYJI5SlJVpTdz4EO47A-g0SVI77ehgZxHB13AZXLKKX02Yr40VZmFsb3blZgHCojeD-vS34SS24yWWqK6rnduGNdhmnXlCAQAMWETrm7_hg" /></p> <table> <thead> <tr> <th><strong>Query Component</strong></th> <th><strong>Description</strong></th> </tr> </thead> <tbody> <tr> <td>Table Name</td> <td>The table name specifies which table/data source the query will pull data from. All queries must start with a table.</td> </tr> <tr> <td>Pipe character (|)</td> <td>The pipe character indicates the start of a new part of the query. A pipe will be added automatically after typing a table name and pressing enter. You can also add a pipe character manually by holding shift and pressing the backslash (\) key. That’s the one just below the backspace key.</td> </tr> </tbody> </table> <p>The <strong>take</strong> operator is a powerful tool you can use to explore rows in a table, and therefore better understand what kinds of data are stored there.</p> <blockquote> <p>🎯<strong>Key Point – What to do when you don’t know what to do</strong>: Whenever you are faced with an unfamiliar database table, the first thing you should do is sample its rows using the <strong>take</strong> operator. That way, you know what fields are available for you to query and you can guess what type of information you might extract from the data source.</p> </blockquote> <p>The Employees table contains information about all the employees in our organization. In this case, we can see that the organization is named “EvolveLabs” and the domain is “evolvelabs.com”.</p> <blockquote> <ol> <li>🤔 Try it for yourself! Do a <strong>take</strong> 10 on all the other tables to see what kind of data they contain.</li> </ol> </blockquote> <p>Make sure you record your answer to all the questions from KQL 101 in the scoreboard at kc7cyber.com/scoreboard</p> <p>You can easily write multiple queries in the same workspace tab. To do this, make sure to separate each query by an empty line. Notice below how we have separated the queries for the <strong>Employees</strong>, <strong>Email</strong>, and <strong>OutboundNetworkEvents</strong> tables by adding empty lines between them.</p> <div class="codehilite"><pre><span></span><code><span class="n">Email</span> <span class="o">|</span><span class="w"> </span><span class="n">take</span><span class="w"> </span><span class="mi">10</span> <span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="n">take</span><span class="w"> </span><span class="mi">10</span> <span class="n">OutboundNetworkEvents</span> <span class="o">|</span><span class="w"> </span><span class="n">take</span><span class="w"> </span><span class="mi">10</span> </code></pre></div> <p>When you have multiple queries, it’s important to tell ADX which query you want to run. To choose a query, just click on any line that is part of that query. </p> <p><strong>Finding Out “How Many”: The count Operator</strong></p> <p>We can use <strong>count</strong> to see how many rows are in a table. This tells us how much data is stored there.</p> <div class="codehilite"><pre><span></span><code><span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="k">count</span> </code></pre></div> <ol> <li>🤔 How many employees are in the company?</li> </ol> <p><strong>Filtering Data With the <em>where</em> Operator</strong></p> <p>So far, we’ve run queries that look at the entire contents of the table. Often in cybersecurity analysis, we only want to look at data that meets a set of conditions or criteria. To accomplish this, we apply filters to specific columns.</p> <p>We can use the <strong>where</strong> operator in KQL to apply filters to a particular field. For example, we can find all the employees with the name “Maya” by filtering on the name column in the Employees table.</p> <p><strong>where</strong> statements are written using a particular structure. Use this helpful chart below to understand how to structure a <strong>where</strong> statement.</p> <table> <thead> <tr> <th><strong>where</strong></th> <th><strong>field</strong></th> <th><strong>operator</strong></th> <th><strong>"value"</strong></th> </tr> </thead> <tbody> <tr> <td>where</td> <td>name</td> <td>has</td> <td>"Linda"</td> </tr> </tbody> </table> <div class="codehilite"><pre><span></span><code><span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="n">has</span><span class="w"> </span><span class="ss">"Linda"</span> </code></pre></div> <p>The <strong>has</strong> operator is useful here because we’re looking for only a partial match. If we wanted to look for an employee with a specific first and last name (an exact match), we’d use the == operator:</p> <div class="codehilite"><pre><span></span><code><span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="ss">"Linda Holbert"</span> </code></pre></div> <blockquote> <ol> <li>🤔 Each employee at Envolve Labs is assigned an IP address. Which employee has the IP address: “192.168.0.191”?</li> </ol> </blockquote> <p>Here are some additional operators we like to use:</p> <table> <thead> <tr> <th>Operator</th> <th>Description</th> <th>Case-Sensitive</th> <th>Example (yields true)</th> </tr> </thead> <tbody> <tr> <td>==</td> <td>Equals</td> <td>Yes</td> <td>"aBc" == "aBc"</td> </tr> <tr> <td>!=</td> <td>Not equals</td> <td>Yes</td> <td>"abc" != "ABC"</td> </tr> <tr> <td>=~</td> <td>Equals</td> <td>No</td> <td>"abc" =~ "ABC"</td> </tr> <tr> <td>contains</td> <td>Right-hand-side (RHS) occurs as a subsequence of left-hand-side (LHS)</td> <td>No</td> <td>"FabriKam" contains "BRik"</td> </tr> <tr> <td>has</td> <td>RHS is a whole term in LHS</td> <td>No</td> <td>"North America" has "america"</td> </tr> <tr> <td>has_all</td> <td>Same as has but works on all of the elements</td> <td>No</td> <td>"North and South America" has_all("south", "north")</td> </tr> <tr> <td>has_any</td> <td>Same as has but works on any of the elements</td> <td>No</td> <td>"North America" has_any("south", "north")</td> </tr> <tr> <td>in</td> <td>Equals to any of the elements</td> <td>Yes</td> <td>"abc" in ("123", "345", "abc")</td> </tr> </tbody> </table> <p>While performing their day-to-day tasks, EvolveLabs employees send and receive emails. A record of each of these emails is stored in the <strong>Email</strong> table.</p> <blockquote> <p>🎯<strong>Key Point – User Privacy and Metadata</strong>: As you can imagine, some emails are highly sensitive. Instead of storing the entire contents of every email sent and received within the company in a database that can be easily accessed by security analysts, we only capture email metadata. </p> </blockquote> <p>Email metadata includes information like: the time the email was sent, the sender, the recipient, the subject line, and any links the email may contain. Storing only email metadata, rather than entire contents, helps protect the privacy of our employees, while also ensuring that our security analysts can keep us safe. Sometimes even metadata can reveal sensitive information, so it’s important that you don’t talk about log data with other employees outside the SOC.</p> <p>We can find information about the emails sent or received by a user by looking for their email address in the sender and recipient fields of the <strong>Email</strong> table. For example, we can use the following query to see all the emails sent by “Michael Montello”:</p> <div class="codehilite"><pre><span></span><code><span class="n">Email</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">sender</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="ss">"michael_montello@envolvelabs.com"</span> </code></pre></div> <blockquote> <ol> <li>🤔 How many emails did Stuart Carter receive?</li> </ol> </blockquote> <p><strong>Easy as 1, 2, 3… Compound Queries and the distinct Operator</strong></p> <p>We can use the <strong>distinct</strong> operator to find unique values in a particular column. We can use the following query to determine how many of the organization’s users sent emails.</p> <div class="codehilite"><pre><span></span><code><span class="n">Email</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">sender</span><span class="w"> </span><span class="n">has</span><span class="w"> </span><span class="ss">"envolvelabs"</span> <span class="o">|</span><span class="w"> </span><span class="k">distinct</span><span class="w"> </span><span class="n">sender</span> <span class="o">|</span><span class="w"> </span><span class="k">count</span> </code></pre></div> <p>This is our first time using a multi-line query with multiple operators, so let’s break it down:</p> <p>In line 2, we take the Email table and filter the data down to find only those rows with "envolvelabs" in the sender column.</p> <p>In line 3, we add another pipe character ( | ) and use the distinct operator to find all the unique senders. Here, we aren’t finding the unique senders for all of the email senders, but only the unique senders that are left after we apply the filter looking for rows with "envolvelabs" in the sender column.</p> <p>Finally, in line 4, we add another pipe character ( | ) and then use the count operator to count the results of lines 1-3 of the query.</p> <ol> <li>🤔 How many distinct senders were seen in the email logs from siusamteas.com?</li> </ol> <p><strong>Tracking Down a Click: OutboundNetworkEvents Data</strong></p> <p>When employees at EvolveLabs browse to a website from within the corporate network, that browsing activity is logged. This is stored in the <strong>OutboundNetworkEvents</strong> table, which contains records of the websites browsed by each user in the company. Whenever someone visits a website, a record of it is stored in the table. However, the user’s name is not stored in the table, only their IP address is recorded. There is a 1:1 relationship between users and their assigned IP addresses, so we can reference the <strong>Employees</strong> table to figure out who browsed a particular website. When a user visits a site, sometimes data from a lot of other sources are loaded as well. For example, images, assets, and other content may be hosted on content delivery network (CDN), which is used to deliver and load content quickly on a website. Sometimes, advertisements will also load from a particular website as well.</p> <p>If we want to figure out what websites Annie Jackson visited, we can find their IP address from the Employees table.</p> <div class="codehilite"><pre><span></span><code><span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="ss">"Annie Jackson"</span> </code></pre></div> <p>The query above tells us their IP address is “192.168.0.94”. We can take their IP address and look in the <strong>OutboundNetworkEvents</strong> table to determine what websites they visited.</p> <div class="codehilite"><pre><span></span><code><span class="n">OutboundNetworkEvents</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">src_ip</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="ss">"192.168.3.168"</span> </code></pre></div> <blockquote> <ol> <li>🤔 How many unique websites did “Keith Mitchell” visit?</li> </ol> </blockquote> <p><strong>What’s in a Name? All about Passive DNS Data</strong></p> <p>Although domain names like “google.com” are easy for humans to remember, computers don’t know how to handle them. So, they convert them to machine readable IP addresses. Just like your home address tells your friends how to find your house or apartment, an IP address tells your computer where to find a page or service hosted on the internet.</p> <blockquote> <p>🎯<strong>Key Point – Practice Good OPSEC</strong>: If we want to find out which IP address a particular domain resolves to, we could just browse to it. But, if the domain is a malicious one, you could download malicious files to your corporate analysis system or tip off the attackers that you know about their infrastructure. As cybersecurity analysts, we must follow procedures and safeguards that protect our ability to track threats. These practices are generally called operational security, or OPSEC.</p> </blockquote> <p>To eliminate the need to actively resolve (that is- directly browse to or interact with a domain to find it’s related IP address) every domain we’re interested in, we can rely on passive DNS data. Passive DNS data allows us to safely explore domain-to-IP relationships, so we can answer questions like:</p> <ul> <li><em>Which IP address does this domain resolve to?</em></li> <li><em>Which domains are hosted on this IP address?</em></li> <li><em>How many other IPs have this domain resolved to?</em></li> </ul> <p>These domain-to-IP relationships are stored in our <strong>PassiveDns</strong> table.</p> <blockquote> <ol> <li>🤔 How many domains in the PassiveDns records contain the word "vaccine"? (hint: use the contains operator instead of has. If you get stuck, do a take 10 on the table to see what fields are available.)</li> <li>🤔 What IPs did the domain “biotechenvolv.science” resolve to?</li> </ol> </blockquote> <p><strong>🤯 Let statements – making your life a bit easier:</strong></p> <p>Sometimes we need to use the output of one query as the input for a second query. The first way we can do this is by manually typing the results into the next query.</p> <p>For example, what if we want to look at all the web browsing activity from employees named "Linda"?</p> <p>First, you would need to go into the <strong>Employees</strong> table and find the IP addresses used by these employees.</p> <div class="codehilite"><pre><span></span><code><span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="n">has</span><span class="w"> </span><span class="ss">"Linda"</span> </code></pre></div> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/5a383622-844e-4ba3-88c4-27027cc7a8eb" /></p> <p>Then, you could manually copy and paste these IPs into a query against the <strong>OutboundNetworkEvents</strong> table. Note that we can use the in operator to choose all rows that have a value matching any value from a list of possible values. In other words, the == (comparison) operator looks for an exact match, while the in operator checks for any values from the list.</p> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/bbcba4dd-490f-4460-a02f-59cccebec0bb" /></p> <p>Although this is a valid way to get the information you need, it may not be as elegant (or timely) if you had 100 or even 1000 employees named “James.”</p> <p>We can accomplish this in a more elegant way by using a <a href="https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/letstatement">let statement,</a> which allows us to assign a name to an expression or a function. We can use a let statement here to save and give a name to the results of the first query so that the values can be re-used later. That means we don’t have to manually type or copy and paste the results repeatedly.</p> <div class="codehilite"><pre><span></span><code><span class="n">let</span><span class="w"> </span><span class="n">linda_ips</span><span class="w"> </span><span class="o">=</span><span class="w"> </span> <span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="n">has</span><span class="w"> </span><span class="ss">"Linda"</span> <span class="o">|</span><span class="w"> </span><span class="k">distinct</span><span class="w"> </span><span class="n">ip_addr</span><span class="p">;</span> <span class="n">OutboundNetworkEvents</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">src_ip</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="p">(</span><span class="n">linda_ips</span><span class="p">)</span> </code></pre></div> <p>On the left of the let statement is the variable name ("linda_ips" in this case). The variable name can be whatever we want, but it is helpful to make it something meaningful that can help us remember what values it is storing.</p> <div class="codehilite"><pre><span></span><code><span class="hll"><span class="n">let</span><span class="w"> </span><span class="n">linda_ips</span><span class="w"> </span><span class="o">=</span><span class="w"> </span> </span><span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="n">has</span><span class="w"> </span><span class="ss">"Linda"</span> <span class="o">|</span><span class="w"> </span><span class="k">distinct</span><span class="w"> </span><span class="n">ip_addr</span><span class="p">;</span> <span class="n">OutboundNetworkEvents</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">src_ip</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="p">(</span><span class="n">linda_ips</span><span class="p">)</span> </code></pre></div> <p>On the right side of the let statement in the expression you are storing. In this case, we use the <strong>distinct</strong> operator to select values from only one column – so they are stored in an array – or list of values.</p> <div class="codehilite"><pre><span></span><code><span class="n">let</span><span class="w"> </span><span class="n">linda_ips</span><span class="w"> </span><span class="o">=</span><span class="w"> </span> <span class="hll"><span class="n">Employees</span> </span><span class="hll"><span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="n">has</span><span class="w"> </span><span class="ss">"Linda"</span> </span><span class="hll"><span class="o">|</span><span class="w"> </span><span class="k">distinct</span><span class="w"> </span><span class="n">ip_addr</span><span class="p">;</span> </span><span class="n">OutboundNetworkEvents</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">src_ip</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="p">(</span><span class="n">linda_ips</span><span class="p">)</span> </code></pre></div> <p>The <strong>let</strong> statement is concluded by a semi-colon.</p> <div class="codehilite"><pre><span></span><code><span class="n">let</span><span class="w"> </span><span class="n">linda_ips</span><span class="w"> </span><span class="o">=</span><span class="w"> </span> <span class="n">Employees</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="n">has</span><span class="w"> </span><span class="ss">"Linda"</span> <span class="hll"><span class="o">|</span><span class="w"> </span><span class="k">distinct</span><span class="w"> </span><span class="n">ip_addr</span><span class="p">;</span> </span><span class="n">OutboundNetworkEvents</span> <span class="o">|</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">src_ip</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="p">(</span><span class="n">linda_ips</span><span class="p">)</span> </code></pre></div> <p>After we store the value of a query into a variable using the <strong>let</strong> statement, we can refer to it as many times as we like in the rest of the query. The stored query does not show any output. Remember, however, that your KQL query must have a tabular statement – which means that you must have another query following your <strong>let</strong> statement.</p> <blockquote> <ol> <li>🤔 How many unique URLs were browsed by employees named “Karen”?</li> </ol> <p>🎯 <strong>Key Point – Pivoting:</strong> Part of being a great cyber analyst is learning how to use multiple data sources to tell a more complete story of what an attacker has done. We call this “pivoting.” We pivot by taking one known piece of data in one dataset and looking in a different dataset to learn something we didn’t already know. You practiced this here when we started in one dataset – the Employees table – and used knowledge from there to find related data in another source – OutboundNetworkEvents.</p> </blockquote> <h2>Section 2: Start Hunting!</h2> <p>Now that you’ve completed your initial round of training, you are ready to work your first case in the SOC!</p> <p>A security researcher tweeted that the domain “immune.tech” was being used by hackers. Apparently the hackers are sending this domain inside credential phishing emails.</p> <p><img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/d9449bb5-8219-4e26-9aba-38aea06617f9" /></p> <p>⚠️ NOTE! This domain and others encountered in this game are fictional and are not representative of actual malicious activity!</p> <p>According to OSINT research your colleagues conducted, this domain may be used as part of a phishing campaign with the following stages: <img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/a63aec8b-5032-4557-9bc7-7e74696e19cf" /></p> <blockquote> <p>🎯 Key Point – Open Source Intelligence (OSINT): Security researchers and analysts often use free, publicly available data, like Twitter! We call this public data OSINT, and it can be a great way to get investigative leads. Like all public data sources on the internet, you should follow up any OSINT tip with rigorous analysis, rather than blindly trusting the source.</p> </blockquote> <p>🤔 Answer the following questions related to this tip: 1. Which users in our organization were sent emails containing the domain immune.tech? 2. Did we block any of the emails containing that domain? Who actually received one of these emails? (hint: the “accepted” field in the Email table tells you whether or not the email was blocked. Blocked emails will show as false). 3. What other domains shared the same IPs as immune.tech? Can you find the full list of domains associated with this actor based on PassiveDns data? (hint: you can use the in operator to check for multiple values in a field. E.g. where field in (“x”, “y”, “z”) 4. What email addresses did the hackers use to send these domains? 5. Did users click on any of the links in the phishing emails? 6. Did any user have their credentials stolen? How do you know?</p> <blockquote> <p>🤫 Hint: In order to have their credentials stolen, a user would need to browse to the credential harvesting site and enter their username and password. After this, the actor might try to login to the user’s account using the stolen credentials. You can find details about login activity in the AuthenticationEvents table.</p> </blockquote> <ol> <li>Did any user have contents exfiltrated (stolen) from their mailbox? How do you know? What risk is posed to the company by this content being stolen?</li> </ol> <h2>Section 3: Hackers sending malware docs</h2> <p>After digging for a bit on the phishing activity, you come across another tweet from a threat intelligence vendor SolitaryStrike: <img alt="image" src="https://github.com/Squiblydoo/kc7_data/assets/77356206/21da1ad7-4399-4d9d-8f11-ba264b89bd9a" /></p> <p>🤔 Use the tipper above to answer the following questions: You can optionally submit your answers to the scoreboard at https://kc7cyber.azurewebsites.net/ to get feedback and earn points. 1. How many emails contained the domain <code>notice.io</code>? How many emails contained the domain <code>notice.io</code>? 2. What email address sent the domain <code>notice.io</code> 3. What was the subject line of the emails containing the domain <code>notice.io</code>? 4. What is the name of the user who clicked on the <code>notice.io</code> link? 5. At what timestamp did the user above download the file: "Critical_Security_Path.docx"? 6. How many emails were sent to your organization on January 9th by users at wesellbeakers.com? 7. What other domains are hosted on the same IPs as <code>notice.io</code>? 8. What email address is seen sending emails containing one of the domains identified in question 7? 9. How many users downloaded the files observed in the emails from question 8? 10. One of the files observed in question 9 - IMPORTANT_INSTRUCTIONS.pptx - was seen in two separate emails. What are the subject lines of these emails? 11. Which compromised pharmasupplies.org email address was used to send a link to scanverify.com? 12. How many IPs has scanverify.com resolved to? 13. Consider the email address you found in question 11. What other domain did this email address send? 14. What is the name of the file hosted on scanverify.com? 15. Which .pptx file was used to target Gerald Kempinski and Kenny Salcido? 16. Which actor IP was used to search EnvolveLabs' website for the term "helpdesk ticket system"? 17. How many total emails were sent to your organization by this actor? 18. Which .dll file was dropped on a victim machine shortly after the user downloaded the malicious zip : EnvolveLabs_Research_Tool.7z</p> <blockquote> <p>🤫 Hint: Files that are created on employees’ devices are captured in the FileCreationEvents log. Try looking there to see which employees downloaded this file.</p> </blockquote> <ol> <li>Which six letter reconnaissance command was executed on the Machine of the user that loaded the implant above?</li> </ol> <blockquote> <p>🤫 Hint: Try narrowing down on one particular device that downloaded the EnvolveLabs_Research_Tool.7z file. Then, look in both the FileCreationEvents and ProcessEvents logs to find new files and processes created around the time when the file was downloaded.</p> </blockquote> <ol> <li>A malicious file 'infector.exe' is observed performing suspicious actions on multiple devices. What process_commandline associated with this file is being used for persistence on the devices?</li> </ol> <blockquote> <p>🤫 Hint: Actors establish persistence so they can come back later and conduct manual tasks (called hands-on-keyboard activity) within your company’s network. Try looking for systems creating connections to external domains and IPs, or unusual behaviors like creation of scheduled tasks.</p> </blockquote> <h2>Glossary</h2> <p><strong>Watering hole:</strong> A type of attack where a hacker compromises a website that is frequently visited by a specific group of users, such as employees of a certain organization, and then infects their devices with malware when they visit the site.</p> <p><strong>Phishing:</strong> A type of attack where a hacker sends an email or other message that appears to be from a legitimate source, such as a bank or a colleague, and tries to trick the recipient into clicking on a malicious link, opening an attachment, or providing sensitive information.</p> <p><strong>Credential theft:</strong> A type of attack where a hacker steals or obtains the username and password of a user or an administrator, and then uses them to access their accounts or systems.</p> <p><strong>Password spray:</strong> A type of attack where a hacker tries to guess the passwords of multiple accounts by using common or weak passwords, such as “password” or “123456”, instead of targeting one account with many password attempts.</p> <p><strong>Reconnaissance:</strong> The process of gathering information about a target system, network, organization, or user before launching an attack. This can include scanning for open ports, identifying vulnerabilities, mapping network topology, collecting email addresses, etc.</p> <p><strong>Supply chain compromise:</strong> A type of attack where a hacker infiltrates the software development process or distribution channel of a trusted vendor or partner and inserts malicious code into their products or services. This way, the hacker can compromise the customers or users who install or use those products or services.</p> <p><strong>Malware:</strong> A general term for any software that is designed to harm or disrupt a system, network, device, or data. Malware can include viruses, worms, trojans</p> <p><strong>Command and control:</strong> A type of server or network that is used by hackers to communicate with and control their malware or botnets on compromised systems, networks, or devices. Command and control servers can send commands, receive data, update malware, or launch attacks.</p> <p><strong>Adversary in the middle:</strong> A type of attack where a hacker intercepts and modifies the communication between two parties, such as a user and a website, without their knowledge. Adversary in the middle attacks can be used to steal or alter data, redirect traffic, inject malware, or impersonate either party.</p> <p><strong>Top level domain:</strong> The highest level of domain names in the Internet’s domain name system. Top level domains are the last part of a domain name after the dot, such as .com, .org, .edu, etc. Top level domains can indicate the type or purpose of a website or its geographic location.</p> <h2>Resources</h2> <p>Understanding KQL operators: <a href="https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/datatypes-string-operators">https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/datatypes-string-operators</a></p>