Thursday, October 20, 2016

What is Splunk?

It's like Google for logs!

When you need to debug an application or system, what do you do? You go through log files. They tell you (almost) everything about what it was trying to do and what happened. But what do you do when you need to debug a distributed, cloud-based, or microservices-based system? Do you go to each and every machine/app and try to correlate that information with the logs of another machine/app? Do you always have a design where all log lines from all those machines or apps are written to a single log? Usually not.

That's where Splunk is really useful. It's a log processing and analysis product that stores all your logs in an indexed manner and provides very fast searching.

Events and Indexes

Each entry that gets stored in Splunk is called an 'event', and the logical place where a particular event is stored is called an 'index'. So when searching, you basically query some index(es) to find some events.
Each indexed event has 4 fields associated with it: time, sourcetype, source and host. The time field indicates when that event happened, sourcetype identifies the data structure of that event, source identifies where the event came from (a file path, for instance), and host is the machine where the event was generated. You can search your data using these fields.

Apart from those, Splunk extracts many more fields from your data, which can also be used while searching it.
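For example, a search like the one below pulls events from a particular index using the default fields plus an extracted one (the index and host names here are made up for illustration; access_combined is Splunk's standard sourcetype for Apache access logs):

    index=web sourcetype=access_combined host=web01 status=500

This would return all events in the 'web' index that came from host 'web01' in Apache access-log format and whose extracted status field is 500.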

The Splunk setup which indexes the data is called an 'indexer'.

They call it SPL

SPL stands for Search Processing Language. It's the query language you use to grab data out of Splunk index(es).

Your search can be a simple term (e.g. a username) to see how frequently it appears in the logs, or it could be a complex one (e.g. a particular source, a particular event, containing this or that, happening between 1 AM and 4:40 AM).
There are endless possibilities for what you can search with Splunk. We use Splunk in my company to analyze application logs and find out which exceptions occur most often and on which days they peak (see the sketch below).
There's a lot more to say about this, and the best way to learn SPL is to go through Splunk's own documentation on SPL.
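To give a taste, here is roughly what that exception analysis could look like in SPL (the index, sourcetype, and exception names are hypothetical):

    index=app_logs sourcetype=java_app "NullPointerException" earliest=-30d
    | timechart span=1d count

The first line is the plain search - pick an index, a sourcetype, a term to match, and a time range - and the timechart command then buckets the matching events per day, so the peak days stand out immediately.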

Apps, Add-ons and Data Sources

OK, we now have at least some idea of what it does. But how does it get my log data?!
Well, there's no big magic in that. Splunk supports apps, add-ons, and other data inputs through which you bring in the data. You can import log files - be they syslog, CSV, or JSON. You can also develop Splunk apps or add-ons that make API calls to the outside world and produce data Splunk understands. Splunk is even capable of consuming data output on stdout/stderr!
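As a minimal sketch, the simplest way to start pulling in a log file is to point Splunk at it - either from the CLI or via an inputs.conf stanza (the file path, index, and sourcetype below are just examples):

    splunk add monitor /var/log/syslog -index main -sourcetype syslog

    [monitor:///var/log/syslog]
    index = main
    sourcetype = syslog

Both do the same thing: Splunk tails the file and indexes every new line as an event.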

There's no database

That brings us to the next point: where does that data go eventually? Let me tell you that there is no database to manage; Splunk stores data directly in the file system. That is, in fact, part of why a Splunk setup is quite fast.
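If you are curious, by default each index ends up as plain directories under Splunk's own tree (the path below is the default location as far as I know; exact layout can vary by version and configuration):

    $SPLUNK_HOME/var/lib/splunk/<index_name>/db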

Scalability is easy

If a single Splunk server is not enough, you can simply add another one; the data can be distributed among multiple Splunk setups. You can have the same forwarder (the lightweight Splunk component that ships data to an indexer) sending your data to multiple indexers, multiple forwarders feeding data to a single indexer, or a combination of both. Not only that, you can also distribute your search operations using multiple 'search heads'. There are two interesting deployment scenarios known as Indexer Cluster and Search Head Cluster, and both are worth exploring when scalability is important.
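As an illustration, forwarding to multiple indexers is just a matter of listing them in the forwarder's outputs.conf (the hostnames are placeholders; 9997 is the conventional receiving port):

    [tcpout:my_indexers]
    server = indexer1.example.com:9997, indexer2.example.com:9997

The forwarder then automatically load-balances events across the listed indexers.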
