Data Archival in KHIKA
Revision as of 09:01, 12 June 2019
Overview
The purpose of this section is to give KHIKA SIEM users and administrators an understanding of the complete life cycle of data stored in KHIKA. In KHIKA, time series log data from data sources is segregated into one or more workspaces, such that data from a distinct data source is typically stored on its own, in each workspace's index. On receiving log data, KHIKA identifies the workspace associated with the data source and stores the data in the corresponding day's index. In other words, data received today is stored in today's data index, while data received tomorrow will be stored in tomorrow's data index.
Since log data accumulated over a period of time tends to become large (more than a few TB), KHIKA data storage is categorized into two types, in order to maintain optimal KHIKA application performance and to ensure prudent use of IT infrastructure and resources:
- Online storage - Data that is readily searchable via the KHIKA UI is stored in online storage. The setting that controls the online data retention period is called “TIME-TO-LIVE” or TTL. TTL is a workspace-level setting and can be configured as per customer requirements. The default TTL, or online data retention period, for a workspace is 90 days.
- Offline storage - Older data, i.e. data beyond the TTL period, is archived, or moved, from online storage to offline storage.
Checking Data Archival details
Go to Configure from the left pane and select the Workspace tab.
In a workspace with the default TTL of 90 days, the KHIKA Data Archival procedure automatically moves data to Offline storage only once it is 91 days old. Newer data in the workspace stays in Online storage until then.
Please note: if the online storage disk utilisation reaches 80% (that is, the disk is 80% full), the oldest day's data is moved to Offline storage even if it is not yet 91 days old.
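The two rules above (age beyond the TTL, and the 80% disk threshold) can be sketched as a small selection function. This is only an illustration, not KHIKA's implementation: the function names and parameters are hypothetical, and it assumes ages are counted inclusively, so the day the data arrives is day 1.

```python
from datetime import date, timedelta

def age_in_days(index_day, today):
    # Assumption: inclusive counting, the day the data arrives is day 1.
    return (today - index_day).days + 1

def days_to_archive(index_days, today, ttl_days=90,
                    disk_used_pct=0.0, disk_threshold_pct=80.0):
    """Return the index days that should move to offline storage, oldest first."""
    ordered = sorted(index_days)
    # Rule 1: data older than the TTL (91 days old for a 90-day TTL) is archived.
    expired = [d for d in ordered if age_in_days(d, today) > ttl_days]
    if expired:
        return expired
    # Rule 2: at or above the disk threshold, the oldest day moves early.
    if ordered and disk_used_pct >= disk_threshold_pct:
        return [ordered[0]]
    return []
```

The sketch is simplified: it moves at most one extra day under disk pressure and does not re-check utilisation afterwards.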
To review the Data Archival status for a workspace, go to Configure from the main KHIKA menu and select the Workspace tab.
Select the required workspace from the dropdown and click the Archival status icon for it. A pop-up appears asking for the from and to dates of the archival report. Select the dates to get the archival status report.
Offline storage
If you have implemented KHIKA on premise
You have to provide the storage to which data moves from online storage after the retention period expires.
As explained in the section above, every workspace has a "Data Retention" value, in days. The oldest day's data in KHIKA for a workspace, once it is one day older than the retention period, has to be moved to secondary storage called the "Offline Storage". For example, say the retention period of your LINUX workspace is 30 days and the workspace holds data beginning from 1st March. On 31st March, the data of 1st March has crossed its retention period. A snapshot of the elastic data index for 1st March is taken and copied to the offline storage.
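The date arithmetic in this example can be checked with a few lines of Python. The function name is hypothetical and the year 2019 is assumed purely for illustration, since the example gives no year.

```python
from datetime import date, timedelta

def archival_date(index_day, retention_days):
    """Day on which a day's index has crossed its retention period."""
    # The index's own day counts as day 1, so it is archived
    # retention_days calendar days later.
    return index_day + timedelta(days=retention_days)

# Data of 1st March with a 30-day retention period.
print(archival_date(date(2019, 3, 1), 30))  # 2019-03-31
```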
Once data moves to Offline storage, it is no longer searchable on the Discover screen. However, it can be recovered to Online storage as needed: if you require older data for investigative purposes, a specific day's index can be moved back to online storage.
For larger volumes of data, you may want to check the daily size of data in KHIKA and then estimate how long you want to retain it and what disk size that requires. This may also depend on any compliance requirements in your organisation.
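A rough sizing estimate follows from multiplying the daily data size by the number of days to retain. The sketch below is a back-of-the-envelope calculation only. The function name, the sample figures and the headroom percentage are assumptions, not KHIKA recommendations.

```python
def required_disk_gb(daily_gb, retention_days, headroom_pct=20):
    """Estimate the disk size in GB needed to hold retention_days of data."""
    # headroom_pct is an assumed safety margin on top of the raw data size.
    return daily_gb * retention_days * (100 + headroom_pct) / 100

# Hypothetical figures: 5 GB per day kept for one year with 20% headroom.
print(required_disk_gb(5, 365))
```

Snapshot compression and index overhead are not modelled here, so treat the result as a starting point rather than a precise figure.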
The Linux server administrator in your environment has to mount a disk partition for offline storage on the KHIKA App server. This is typically larger than the online disk space and can hold your data for up to a year or more. If the offline disk space also fills up, there are two options:
- Increase the offline disk space
- Manually move the oldest data to another long-term storage device as and when required. (KHIKA does not do this automatically.)
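Since KHIKA does not move data off the offline disk by itself, an administrator may want to watch how full that disk is. A minimal sketch using Python's standard `shutil.disk_usage` follows. The function names and the 90% warning level are hypothetical, and the mount point is whatever path your administrator chose.

```python
import shutil

def utilisation_pct(used_bytes, total_bytes):
    """Percentage of the disk that is in use."""
    if total_bytes == 0:
        return 0.0
    return 100 * used_bytes / total_bytes

def offline_disk_needs_attention(mount_point, warn_pct=90.0):
    """True when the disk holding mount_point is at or above warn_pct full."""
    # warn_pct is an assumed threshold, not a KHIKA setting.
    usage = shutil.disk_usage(mount_point)
    return utilisation_pct(usage.used, usage.total) >= warn_pct
```

When this returns True, either of the two options above applies: grow the partition, or move the oldest snapshots to long-term storage.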
If you have implemented KHIKA as SaaS
Online storage
- Online storage in KHIKA is limited to a maximum of 3 GB of data per day.
- This includes data from multiple devices, stored in multiple workspaces.
- Data is retained in KHIKA for 3 days.
- Data is discarded on its 4th day.
For additional Online storage and retention, please contact our sales team at info@khika.com.
Offline storage
For any queries related to offline storage, please contact our sales team at info@khika.com.
Archival Process
A scheduled Archival process in KHIKA automatically moves data that has crossed its retention period from online to offline storage each day. It follows the "Data Retention" value in each of your workspaces.
This process runs only when the offline disk is mounted on the KHIKA server.
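Whether the offline disk is actually mounted can be verified before expecting the archival process to run. The sketch below uses Python's standard `os.path.ismount`. The path `/mnt/khika_offline` and the function name are hypothetical, so substitute the mount point used in your environment.

```python
import os

# Hypothetical mount point for the offline storage partition.
OFFLINE_MOUNT = "/mnt/khika_offline"

def offline_disk_mounted(mount_point=OFFLINE_MOUNT):
    """True if mount_point exists and is a mount point."""
    # A plain directory or a missing path both return False.
    return os.path.ismount(mount_point)

if not offline_disk_mounted():
    print("Offline disk is not mounted; archival will not run.")
```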