, dealing with streaming data was considered an avant-garde approach. Since the introduction of relational database management systems in the 1970s and traditional data warehousing systems in the late 1980s, all data workloads began and ended with so-called batch processing. Batch processing relies on the concept of collecting numerous tasks in a group (or batch) and processing these tasks in a single operation.
On the flip side, there is the concept of streaming data. Although streaming data is still commonly considered cutting-edge technology, it already has a solid history. Everything started in 2002, when Stanford University researchers published the paper called "Models and Issues in Data Stream Systems". However, it wasn't until almost a decade later (2011) that streaming data systems started to reach a wider audience, when the Apache Kafka platform for storing and processing streaming data was open-sourced. The rest is history, as people say. Nowadays, processing streaming data isn't considered a luxury but a necessity.
Microsoft recognized the growing need to process data "as soon as it arrives". Hence, Microsoft Fabric doesn't disappoint in that regard: Real-Time Intelligence is at the core of the entire platform and offers a complete range of capabilities to handle streaming data efficiently.
Before we dive deep into explaining each component of Real-Time Intelligence, let's take one step back and take a more tool-agnostic look at stream processing in general.
What is stream processing?
If you enter the phrase from the section title in Google Search, you'll get more than 100,000 results! Therefore, I'm sharing an illustration that represents our understanding of stream processing.
Let's now examine typical use cases for stream processing:
- Fraud detection
- Real-time stock trades
- Customer activity
- Log monitoring: troubleshooting systems, devices, etc.
- Security information and event management: analyzing logs and real-time event data for monitoring and threat detection
- Warehouse inventory
- Ride share matching
- Machine learning and predictive analytics
As you may have noticed, streaming data has become an integral part of numerous real-life scenarios and is considered vastly superior to traditional batch processing for the aforementioned use cases.
Let's now explore how streaming data processing is implemented in Microsoft Fabric and which tools of the trade we have at our disposal.
The following illustration shows a high-level overview of all Real-Time Intelligence components in Microsoft Fabric:

Real-Time hub
Let's kick it off by introducing the Real-Time hub. Every Microsoft Fabric tenant automatically provisions a Real-Time hub. This is a focal point for all data-in-motion across the entire organization. Similar to OneLake, there is one, and only one, Real-Time hub per tenant, which means you can't provision or create multiple Real-Time hubs.
The main purpose of the Real-Time hub is to enable quick and easy discovery, ingestion, management, and consumption of streaming data from a wide range of sources. In the following illustration, you can find an overview of all the data streams in the Real-Time hub in Microsoft Fabric:

Let's now explore all the available options in the Real-Time hub.
- The All data streams tab displays all the streams and tables you can access. Streams represent the output from Fabric eventstreams, whereas tables come from KQL databases. We will explore both eventstreams and KQL databases in more detail in the following sections
- The My data streams tab shows all the streams you brought into Microsoft Fabric into My workspace
- The Data sources tab is at the core of bringing data into Fabric, both from inside and outside. Once you find yourself in the Data sources tab, you can choose between numerous out-of-the-box connectors, such as Kafka, CDC streams for various database systems, external cloud solutions like AWS and GCP, and many more
- The Microsoft sources tab filters the previous set of sources down to Microsoft data sources only
- The Fabric events tab displays the list of system events generated in Microsoft Fabric that you can access. Here, you can choose between Job events, OneLake events, and Workspace item events. Let's dive into each of these three options:
- Job events are events produced by status changes on Fabric monitor activities, such as job created, succeeded, or failed
- OneLake events represent events produced by actions on files and folders in OneLake, such as file created, deleted, or renamed
- Workspace item events are produced by actions on workspace items, such as item created, deleted, or renamed
- The Azure events tab shows the list of system events generated in Azure Blob Storage
The Real-Time hub provides numerous connectors for ingesting data into Microsoft Fabric. It also enables creating streams for all of the supported sources. After a stream is created, you can process, analyze, and act on it.
- Processing a stream allows you to apply numerous transformations, such as aggregate, filter, union, and many more. The goal is to transform the data before you send the output to supported destinations
- Analyzing a stream allows you to add a KQL database as a destination of the stream, and then open the KQL database and execute queries against it
- Acting on streams assumes setting alerts based on conditions and specifying actions to be taken when certain conditions are met
Eventstreams
If you're a low-code or no-code data professional and you need to handle streaming data, you'll love Eventstreams. In a nutshell, an eventstream allows you to connect to numerous data sources, which we examined in the previous section, optionally apply various data transformation steps, and finally output the results into one or more destinations. The following figure illustrates a typical workflow for ingesting streaming data into three different destinations: Eventhouse, Lakehouse, and Activator:

Within the Eventstream settings, you can adjust the retention period for the incoming data. By default, the data is retained for one day, and events are automatically removed when the retention period expires.
Aside from that, you may also want to fine-tune the event throughput for incoming and outgoing events. There are three options to choose from:
- Low: < 10 MB/s
- Medium: 10-100 MB/s
- High: > 100 MB/s
Eventhouse and KQL database
In the previous section, you learned how to connect to various streaming data sources, optionally transform the data, and finally load it into the final destination. As you might have noticed, one of the available destinations is the Eventhouse. In this section, we'll explore the Microsoft Fabric items used to store data within the Real-Time Intelligence workload.
Eventhouse
We'll first introduce the Eventhouse item. The Eventhouse is nothing else but a container for KQL databases. The Eventhouse itself doesn't store any data; it simply provides the infrastructure within the Fabric workspace for dealing with streaming data. The following figure displays the System overview page of the Eventhouse:

The great thing about the System overview page is that it provides all the key information at a glance. Hence, you can immediately understand the running state of the eventhouse, OneLake storage usage (further broken down to the individual KQL database level), compute usage, most active databases and users, and recent events.
If we switch to the Databases page, we can see a high-level overview of the KQL databases that are part of the existing Eventhouse, as shown below:

You can create multiple eventhouses in a single Fabric workspace. Also, a single eventhouse may contain multiple KQL databases:

Let's wrap up the story about the Eventhouse by explaining the concept of Minimum consumption. By design, the Eventhouse is optimized to auto-suspend services when not in use. Consequently, when these services are reactivated, it might take some time for the Eventhouse to be fully available again. However, there are certain business scenarios where this latency isn't acceptable. In those scenarios, make sure to configure the Minimum consumption feature. By configuring Minimum consumption, the service is always available, but you're responsible for determining the minimum level, which is then available for the KQL databases inside the Eventhouse.
KQL database
Now that you've learned about the Eventhouse container, let's focus on the core item for storing real-time analytics data: the KQL database.
Let's take one step back and explain the name of the item first. While most data professionals have at least heard about SQL (which stands for Structured Query Language), I'm quite confident that KQL is much more cryptic than its "structured" relative.
You might have rightly assumed that the QL in the abbreviation stands for Query Language. But what does the letter K represent? It's an abbreviation for Kusto. I hear you, I hear you: what on earth is Kusto?! Although urban legend says that the language was named after the famous polymath and oceanographer Jacques Cousteau (his last name is pronounced "Kusto"), I couldn't find any official confirmation from Microsoft for this story. What is definitely known is that it was the internal project name for the Log Analytics Query Language.
While we're on the subject of history, let's share one more history lesson. If you ever worked with Azure Data Explorer (ADX) in the past, you're in luck: the KQL database in Microsoft Fabric is the official successor of ADX. Similar to many other Azure data services that were rebuilt and integrated into the SaaS-fied nature of Fabric, ADX provided the platform for storing and querying real-time analytics data. The engine and core capabilities of the KQL database are the same as in Azure Data Explorer; the key difference is the management behavior: Azure Data Explorer is a PaaS (Platform-as-a-Service) offering, whereas the KQL database is a SaaS (Software-as-a-Service) solution.
Although you can store any kind of data in the KQL database (unstructured, semi-structured, and structured), its main purpose is handling telemetry, logs, events, traces, and time-series data. Under the hood, the engine leverages optimized storage formats, automatic indexing and partitioning, and advanced data statistics for efficient query planning.
Let's now examine how to leverage the KQL database in Microsoft Fabric to store and query real-time analytics data. Creating a database is as straightforward as it could be. The following figure illustrates the two-step process of creating a KQL database in Fabric:

- Click on the "+" sign next to KQL databases
- Provide the database name and choose its type. The type can be the default new database, or a shortcut database. A shortcut database is a reference to a different database, which can be either another KQL database in Real-Time Intelligence in Microsoft Fabric, or an Azure Data Explorer database
Don't confuse the concept of OneLake shortcuts with the concept of the shortcut database type in Real-Time Intelligence! Whereas the latter simply references an entire KQL/Azure Data Explorer database, OneLake shortcuts allow the use of data stored in Delta tables across other OneLake workloads, such as lakehouses and/or warehouses, and even external data sources (ADLS Gen2, Amazon S3, Dataverse, Google Cloud Storage, to name a few). This data can then be accessed from KQL databases by using the external_table() function
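As a quick illustration, here is how data behind a OneLake shortcut might be queried from a KQL database. This is a minimal sketch; the shortcut name and column names are hypothetical:

```kusto
// 'SalesOrders' is a hypothetical OneLake shortcut in the KQL database.
// external_table() reads the Delta table that the shortcut points to.
external_table('SalesOrders')
| where OrderDate > ago(7d)
| summarize TotalOrders = count() by Region
```

From that point on, the shortcut data participates in the query just like any native KQL table.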
Let's now take a quick tour of the key features of the KQL database from the user-interface perspective. The figure below illustrates the main points of interest:

- Tables – displays all the tables in the database
- Shortcuts – shows tables created as OneLake shortcuts
- Materialized views – a materialized view represents an aggregation query over a source table or another materialized view. It consists of a single summarize statement
- Functions – user-defined functions stored and managed at the database level, similar to tables. These functions are created by using the .create function command
- Data streams – all streams that are relevant for the selected KQL database
- Data Activity Tracker – shows the activity in the database for the selected time period
- Tables/Data preview – enables switching between two different views. Tables displays a high-level overview of the database tables, whereas Data preview shows the top 100 records of the selected table
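To make the materialized view and function items more concrete, here is a hedged sketch of the management commands involved, written against the sample Weather table used later in this article; the view and function names are made up:

```kusto
// A materialized view consists of a single summarize statement
// over a source table (here: daily property damage per state)
.create materialized-view DailyDamage on table Weather
{
    Weather
    | summarize TotalDamage = sum(DamageProperty) by State, bin(StartTime, 1d)
}

// A user-defined function stored and managed at the database level
.create function StormsByState(state: string)
{
    Weather
    | where State == state
}
```

Once created, both appear in the database tree and can be queried like tables (the function is invoked as StormsByState('TEXAS')).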
Query and visualize data in Real-Time Intelligence
Now that you've learned how to store real-time analytics data in Microsoft Fabric, it's time to get our hands dirty and extract some business insight from this data. In this section, I'll focus on explaining the various options for extracting useful information from the data stored in the KQL database.
Hence, in this section, I'll introduce common KQL functions for data retrieval, and explore Real-time dashboards for visualizing the data.
KQL queryset
The KQL queryset is the Fabric item used to run queries and view and customize results from various data sources. As soon as you create a new KQL database, a KQL queryset item is provisioned out of the box. This is a default KQL queryset that is automatically connected to the KQL database under which it exists. The default KQL queryset doesn't allow multiple connections.
On the flip side, when you create a custom KQL queryset item, you can connect it to multiple data sources, as shown in the following illustration:

Let's now introduce the building blocks of KQL and examine some of the most commonly used operators and functions. KQL is a fairly simple yet powerful language. To some extent, it's very similar to SQL, especially in terms of using schema entities that are organized in hierarchies, such as databases, tables, and columns.
The most common type of KQL query statement is a tabular expression statement. This means that both the query input and output consist of tables or tabular datasets. Operators in a tabular statement are sequenced by the "|" (pipe) symbol. Data flows (is piped) from one operator to the next, as displayed in the following code snippet:
MyTable
| where StartTime between (datetime(2024-11-01) .. datetime(2024-12-01))
| where State == "Texas"
| count
The piping is sequential: the data flows from one operator to the next, which means that the order of query operators is important and may affect both the output results and performance.
In the code example above, the data in MyTable is first filtered on the StartTime column, then filtered on the State column; finally, the query returns a table containing a single column and a single row, displaying the count of the filtered rows.
The fair question at this point would be: what if I already know SQL? Do I need to learn yet another language just for the sake of querying real-time analytics data? The answer is, as usual: it depends.
Fortunately, I have both good and great news to share here!
The good news is: you CAN write SQL statements to query the data stored in the KQL database. However, the fact that you can do something doesn't mean you should… By writing SQL-only queries, you're missing the point, and limiting yourself from using the many KQL-specific functions that are built to handle real-time analytics queries in the most efficient way.
The great news is: by leveraging the explain operator, you can "ask" Kusto to translate your SQL statement into an equivalent KQL statement, as displayed in the following figure:
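A minimal sketch of the syntax, assuming the sample Weather table introduced just below:

```kusto
// Prefix a SQL query with the explain keyword and Kusto
// returns the equivalent KQL statement instead of the data
explain
SELECT COUNT(*) FROM Weather WHERE State = 'TEXAS'
```

The output is the translated KQL query, which you can copy into the queryset and refine from there.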

In the following examples, we'll query the sample Weather dataset, which contains data about weather storms and damages in the USA. Let's start simple and then introduce some more complex queries. In the first example, we'll count the number of records in the Weather table:
//Count records
Weather
| count
Wondering how to retrieve only a subset of records? You can use either the take or the limit operator:
//Sample data
Weather
| take 10
Please keep in mind that the take operator won't return the TOP n records, unless your data is sorted in a specific order. Generally, the take operator returns any n records from the table.
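If you need a deterministic result, sort the data explicitly before taking a subset (or use the top operator shown later). A minimal sketch against the same sample table:

```kusto
// Sorting first makes the 10 returned records deterministic
Weather
| sort by DamageProperty desc
| take 10
```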
In the next step, we want to extend this query and return not only a subset of rows, but also a subset of columns:
//Sample data from a subset of columns
Weather
| take 10
| project State, EventType, DamageProperty
The project operator is the equivalent of the SELECT statement in SQL. It specifies which columns should be included in the result set.
In the following example, we're creating a calculated column, Duration, that represents the duration between the EndTime and StartTime values. In addition, we want to display only the top 10 records, sorted by the DamageProperty value in descending order:
//Create calculated columns
Weather
| where State == 'NEW YORK' and EventType == 'Winter Weather'
| top 10 by DamageProperty desc
| project StartTime, EndTime, Duration = EndTime - StartTime, DamageProperty
It's the right moment to introduce the summarize operator. This operator produces a table that aggregates the content of the input table. Hence, the following statement will display the total number of records per state, keeping only the top 5 states:
//Use summarize operator
Weather
| summarize TotalRecords = count() by State
| top 5 by TotalRecords
Let's expand on the previous code and visualize the data directly in the result set. I'll add another line of KQL code to render the results as a bar chart:
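The full query behind the figure looks roughly like this; the render operator instructs the engine to visualize the result set:

```kusto
//Render results as a bar chart
Weather
| summarize TotalRecords = count() by State
| top 5 by TotalRecords
| render barchart
```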

As you may notice, the chart can be further customized from the Visual formatting pane on the right-hand side, which provides even more flexibility when visualizing the data stored in the KQL database.
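One more aggregation worth mentioning, since a conditional count reappears in the dashboard examples below: summarize also supports conditional aggregation functions such as countif(). A hedged sketch against the same sample table:

```kusto
// countif() counts only the records that satisfy the condition
Weather
| summarize WinterEvents = countif(EventType == 'Winter Weather') by State
| top 10 by WinterEvents
```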
These were just basic examples of using the KQL language to retrieve the data stored in Eventhouses and KQL databases. I can assure you that KQL won't let you down in more advanced use cases when you need to manipulate and retrieve real-time analytics data.
I understand that SQL is the "lingua franca" of many data professionals. And although you can write SQL to retrieve the data from the KQL database, I strongly encourage you to refrain from doing this. As a quick reference, I'm providing you with a "SQL to KQL cheat sheet" to give you a head start when transitioning from SQL to KQL.
Also, my friend and fellow MVP Brian Bønk published and maintains a fantastic reference guide for the KQL language here. Make sure to give it a try if you are working with KQL.
Real-time dashboards
While KQL querysets represent a powerful way of exploring and querying data stored in Eventhouses and KQL databases, their visualization capabilities are fairly limited. Yes, you can visualize results in the query view, as you've seen in one of the previous examples, but this is more of a "first aid" visualization that won't make your managers and business decision-makers happy.
Fortunately, there is an out-of-the-box solution in Real-Time Intelligence that supports advanced data visualization concepts and features. The Real-Time Dashboard is a Fabric item that enables the creation of interactive and visually appealing business-reporting solutions.
Let's first identify the core components of the Real-Time Dashboard. A dashboard consists of one or more tiles, optionally structured and organized in pages, where each tile is populated by an underlying KQL query.
As a first step in the process of creating Real-Time Dashboards, this setting must be enabled in the Admin portal of your Fabric tenant:

Next, you should create a new Real-Time Dashboard item in the Fabric workspace. From there, let's connect to our Weather dataset and configure our first dashboard tile. We'll execute one of the queries from the previous section to retrieve the top 10 states with the conditional count function. The figure below shows the tile settings panel with numerous options to configure:

- KQL query to populate the tile
- Visual representation of the data
- Visual formatting pane with options to set the tile title and description
- Visual type drop-down menu to select the desired visual type (in our case, a table visual)
Let's now add two more tiles to our dashboard. I'll copy and paste two queries that we previously used: the first will retrieve the top 5 states by total number of records, while the other will display the damage property value change over time for the state of New York and the winter weather event type.

You can also add a tile directly from the KQL queryset to an existing dashboard, as illustrated below:

Let's now focus on the various capabilities you have when working with Real-Time Dashboards. In the top ribbon, you'll find options to add a new data source, set a new parameter, and add base queries. However, what really makes Real-Time Dashboards powerful is the possibility to set alerts on a Real-Time Dashboard. Depending on whether the conditions defined in the alert are met, you can trigger a specific action, such as sending an email or Microsoft Teams message. An alert is created using the Activator item.

Visualize data with Power BI
Power BI is a mature and widely adopted tool for building robust, scalable, and interactive enterprise reporting solutions. In this section, we specifically focus on how Power BI works in synergy with the Real-Time Intelligence workload in Microsoft Fabric.
Creating a Power BI report based on the data stored in the KQL database couldn't be easier. You can choose to create a Power BI report directly from the KQL queryset, as displayed below:

Each query in the KQL queryset represents a table in the Power BI semantic model. From here, you can build visualizations and leverage all the existing Power BI features to design an effective, visually appealing report.
Obviously, you can still leverage the "regular" Power BI workflow, which assumes connecting from Power BI Desktop to a KQL database as a data source. In this case, you need to open the OneLake data hub and select KQL Databases as a data source:

The same as for SQL-based data sources, you can choose between the Import and DirectQuery storage modes for your real-time analytics data. Import mode creates a local copy of the data in Power BI's database, whereas DirectQuery enables querying the KQL database in near-real-time.
Activator
Activator is one of the most innovative features in the entire Microsoft Fabric realm. I'll cover Activator in detail in a separate article. Here, I just want to introduce this service and briefly emphasize its main characteristics.
Activator is a no-code solution for automatically taking actions when conditions in the underlying data are met. Activator can be used in conjunction with Eventstreams, Real-Time Dashboards, and Power BI reports. Once the data hits a certain threshold, Activator automatically triggers the specified action, for example, sending an email or Microsoft Teams message, or even firing Power Automate flows. I'll cover all these scenarios in more depth in a separate article, where I'll also provide some practical scenarios for implementing the Activator item.
Conclusion
Real-Time Intelligence, something that started as part of the "Synapse experience" in Microsoft Fabric, is now a separate, dedicated workload. That tells us a lot about Microsoft's vision and roadmap for Real-Time Intelligence!
Don't forget: originally, Real-Time Analytics was included under the Synapse umbrella, together with the Data Engineering, Data Warehousing, and Data Science experiences. However, Microsoft decided that handling streaming data deserves a dedicated workload in Microsoft Fabric, which absolutely makes sense considering the growing need to deal with data in motion and provide insight from this data as soon as it's captured. In that sense, Microsoft Fabric offers a complete suite of powerful services as the next generation of tools for processing, analyzing, and acting on data as it's generated.
I'm quite confident that the Real-Time Intelligence workload will become more and more significant in the future, considering the evolution of data sources and the increasing pace of data generation.
Thanks for reading!