Difference between revisions of "Define your own enrichment"

From khika

Revision as of 13:08, 5 June 2019

Introduction

Enrichment, as the word suggests, adds context to streaming data. At its simplest, you can enrich data at run time by referring to an external CSV file. Some examples:

  • A CSV file can hold inventory information (such as computer name, location, owner, and service tag) with the computer name as the primary key. You can use it to enrich Windows AD logs and add more context to login information.
  • If you have a CSV database of IPs with bad reputation, with the IP address as the primary key and country, city, etc. as other columns, you can refer to it from streaming firewall logs to enrich any communication with bad IPs.

There are many more ways to use static CSV-based enrichment. You can change these CSV files dynamically, and KHIKA will consume the changes immediately.
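The inventory example above can be sketched in a few lines of Python. This is only an illustration of the lookup idea, not KHIKA's actual implementation or configuration: the column names, the event fields (`workstation`, `user`) and the helper functions `load_lookup` and `enrich` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical inventory CSV. Column names are illustrative, not a KHIKA schema.
INVENTORY_CSV = """computer_name,location,owner,service_tag
WS-001,Pune,alice,ABC123
WS-002,Mumbai,bob,XYZ789
"""


def load_lookup(csv_text, key_column):
    """Index the CSV rows by the primary-key column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_column]: row for row in reader}


def enrich(event, lookup, event_key, key_column):
    """Return a copy of the event with the matching row's columns added."""
    enriched = dict(event)
    row = lookup.get(event.get(event_key))
    if row is not None:
        for column, value in row.items():
            if column != key_column:
                # Do not overwrite fields the event already carries.
                enriched.setdefault(column, value)
    return enriched


inventory = load_lookup(INVENTORY_CSV, "computer_name")

# Hypothetical, already-parsed AD login event.
login_event = {"event_id": "4624", "user": "carol", "workstation": "WS-002"}
enriched_event = enrich(login_event, inventory, "workstation", "computer_name")
print(enriched_event)
```

An event whose key has no matching row is passed through unchanged, which is usually the safer default for a streaming pipeline.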

A more advanced and powerful feature of enrichment is KHIKA's ability to build the CSV database from a streaming data source and use it to enrich another data source. With this ability, you can correlate or stitch together logs from different data sources at run time, provided they have a field in common. Some examples:

  • We can build an IP-and-username database at run time from AD logs, with the IP address as the primary key (event 4624 can be used to extract these fields). This database can then be referenced when processing Linux logins to add the AD user, since Linux logs contain the IP address of the login workstation but not the AD username. (Linux usernames are different from AD usernames.)
  • We can extract the session ID and IP address from web logs, with the session ID as the primary key, and use them to add the client IP address to application logs that have the session ID but not the client IP.
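
The first example above (AD logins enriching Linux logins) might look roughly like the following Python sketch. It only shows the idea of building a lookup table from one stream and using it on another; it does not reflect how KHIKA stores or refreshes the table internally. The event field names (`ip`, `user`, `src_ip`, `ad_user`) and both helper functions are hypothetical.

```python
def build_ip_user_table(ad_events):
    """Build an IP -> AD username table from AD logon events (event 4624)."""
    table = {}
    for event in ad_events:
        if event.get("event_id") == "4624" and event.get("ip") and event.get("user"):
            # The IP address is the primary key; the latest logon wins.
            table[event["ip"]] = event["user"]
    return table


def enrich_linux_login(event, ip_to_user):
    """Add the AD username to a Linux login event when its source IP is known."""
    enriched = dict(event)
    ad_user = ip_to_user.get(event.get("src_ip"))
    if ad_user is not None:
        enriched["ad_user"] = ad_user
    return enriched


# Hypothetical, already-parsed AD events. Only event 4624 feeds the table.
ad_events = [
    {"event_id": "4624", "ip": "10.0.0.5", "user": "alice"},
    {"event_id": "4634", "ip": "10.0.0.9", "user": "bob"},
    {"event_id": "4624", "ip": "10.0.0.7", "user": "carol"},
]
ip_to_user = build_ip_user_table(ad_events)

# Hypothetical Linux login event: it has the workstation IP but no AD username.
linux_event = {"host": "web01", "linux_user": "deploy", "src_ip": "10.0.0.7"}
enriched_linux = enrich_linux_login(linux_event, ip_to_user)
print(enriched_linux)
```

The session-ID example follows the same pattern, with the session ID as the key and the client IP as the value.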

Let us walk through examples, beginning with simple enrichment using static CSV files.