KQL – The mother of all languages in Azure
KQL, the Kusto Query Language, is used in many Microsoft services, including the Azure and Microsoft Security platforms such as the ATP family.
This post is part of a series of blog posts about KQL. It will provide you the basic and advanced information such as the syntax of KQL, then cover advanced topics such as machine learning and time series analysis, as well as exporting your data to various platforms.
Kusto Query Language is the PowerShell of Query Languages
Kusto is a service for storing and running interactive analytics over Big Data.
It is based on relational database management systems, supporting databases, tables, and columns. It provides complex analytics query operators, such as calculated columns, searching and filtering or rows, group by-aggregates, joins.
Kusto offers excellent data ingestion and query performance by “sacrificing” the ability to perform in-place updates of individual rows and cross-table constraints or transactions. Therefore, it supplants, rather than replaces, traditional RDBMS systems for OLTP and data warehousing scenarios.
As a Big Data service, Kusto handles structured, semi-structured (JSON-like nested types), and unstructured, including free-text data, equally well.
Interacting with Kusto – The main way for users to interact with Kusto is by using one of the many client tools available for Kusto. While SQL queries to Kusto are supported, the primary means of interaction with Kusto is using the Kusto query language to send data queries and use control commands to manage Kusto entities, discover metadata, etc. Both queries and control commands are short textual “programs.”
A Kusto query is a read-only request to process Kusto data and return the processing results without modifying the Kusto data or metadata. Kusto queries can use the SQL language or the Kusto query language. As an example for the latter, the following query counts how many rows in the logs table has the value of the Level column equals the string Critical.
Control commands – Control commands are requests to Kusto to process and potentially modify data or metadata. For example, the following control command creates a new Kusto table with two columns, Level and Text.
Control commands have syntax (which is not part of the Kusto query language syntax, although the two share many concepts). In particular, control commands are distinguished from queries by having the first character in the command’s text be the dot.
Kusto Query Language (KQL)
A Kusto query is a read-only request to process data and return results. The request is in plain text, using a data-flow model to make the syntax easy to read, author, and automate. The query uses schema entities organized in a hierarchy similar to SQL’s: databases, tables, and columns.
The query consists of a sequence of query statements, delimited by a semicolon (;), with at least one statement being a tabular expression statement, which is a statement that produces data arranged in a table-like mesh of columns and rows. The query’s tabular expression statements produce the results of the query.
Kusto Query Language (KQL) is very straightforward and easy to learn. It’s the same language used in Azure Log Analytics and Application Insights so that you won’t have any issues already using it there.
Your query starts easily regarding the table. You use this tabular data and run it through a set of statements connected by pipes to shape your data.
Below is an example from the Microsoft KQL documentation.
| where Process in (Top5Processes)
| summarize count() by bin (TimeGenerated, 1h), Process
| render time chart
How does it work?
Examples for ADE
- Azure Data Explorer is a distributed cluster instance provisioned from Azure. It’s maintained and operated by Microsoft.
- In your cluster, you need to host a database with tables. ADE does allow for dynamic column types meaning you can add just about anything into your tables.
- You can use Azure Active Directory to assign permissions to clusters, databases, and tables in terms of security.
- There are two methods to ingest data: queued ingestion, where data is ingested asynchronously, or direct ingestion, where data is pushed into ADE immediately.
- Use KQL to query your data.
- Visualize it either through the ‘render’ command in KQL or through PowerBI
Products Using Kusto Query Language
Azure Security Center via Log Analytics – https://docs.microsoft.com/en-us/azure/security-center/security-center-search
Application Insights –https://blogs.msdn.microsoft.com/bharry/2016/03/28/introducing-application-analytics/
CMPivot uses a subset of the Log Analytics language – https://docs.microsoft.com/en-us/sccm/core/servers/manage/cmpivot
Log Analytics suggests several monitoring solutions across Azure and on-prem, as add-ons that allow you to ingest log and metric data into your workspace.
Note: Currently, Azure Security Center, Log Analytics, and Application Insights data can be queried from a Log Analytics workspace.
Ad-Hoc Query tools
- Kusto.Explorer – The primary desktop tool for querying and controlling Kusto
- Web UI – Web UI for querying Kusto
Visualizations, Dashboards, and reporting tools
- Azure Notebooks – Query Kusto from an Azure Notebooks notebook
- Excel blank query – Add Kusto query as an Excel data source.
- Excel connector – an Excel connector for Azure Data Explorer
- PowerBI – Build reports and dashboards in Power BI on top of Azure Data Explorer data.
- PowerBI Connector – a Power BI Desktop connector for Azure Data Explorer
- Grafana – an analytics platform that enables creating and sharing dashboards
- Microsoft Flow – Execute Kusto queries automatically as part of Microsoft Flow.
- Microsoft Logic App – Execute Kusto queries automatically as part of the Microsoft Logic App
Data ingestion tools
- LightIngest – Help utility for ad-hoc data ingesting into Azure Data Explorer
- Azure Data Factory – Copy data from source data stores to Azure Data Explorer and from Azure Data Explorer to any supported sink datastore
Source control integration tools
- Azure Pipelines – Invokes control commands as part of your pipeline.
- Sync Kusto – Sync Kusto stored functions to/from Git
KQL and Azure Sentinel
Advanced Machine Learning capabilities built-in into Azure Sentinel can detect a threat’s indicative behaviors and help security analysts learn the expected behavior in their enterprise.
Also, Azure Sentinel provides out-of-the-box detection queries that leverage the Machine Learning capabilities of Kusto query language to detect suspicious behaviors such as abnormal traffic in firewall data, suspicious authentication patterns, and resource creation anomalies.
Query Accounts from Unusual large number of locations
A typical organization may have many users and many applications using Azure Active Directory for authentication. Some applications may have many more authentications than others and thus dominate the data.
Users may also have a different location profile depending on the application. For example, high location variability for email access may be expected, but less so for development activity associated with Visual Studio authentications.
The ability to track location variability for every user/application combination and then investigate just some of the most unusual cases can be achieved by leveraging the built-in query capabilities using the operator’s make-series and series_fit_line.
| where TimeGenerated >= ago(30d)
| extend locationString= strcat(tostring(LocationDetails[“countryOrRegion”]), “/”, tostring(LocationDetails[“state”]), “/”, tostring(LocationDetails[“city”]), “;”)
| project TimeGenerated, AppDisplayName , UserPrincipalName, locationString
| make-series dLocationCount = dcount(locationString) on TimeGenerated in range(startofday(ago(30d)),now(), 1d)
by UserPrincipalName, AppDisplayName
| extend (RSquare,Slope,Variance,RVariance,Interception,LineFit)=series_fit_line(dLocationCount)
| where Slope >0.3