Thursday, October 20, 2016

How to set up a Splunk Search Head Cluster?

If you already know what Splunk is and are interested in setting up your own Search Head Cluster (SHC), continue reading.

For this, the environment will be:

  • 1 Deployer – sends apps/configurations to the search heads
  • 3 Search Heads – for the SHC
  • 1 Indexer – the “search peer” that the SHC will dispatch jobs to
  • 1 Forwarder – for testing data input from the TA/App into the indexer


Sizing-wise, you could make them all VMs. Something reasonably small like the following would do for each system, with the Deployer and Forwarder being much smaller.

  • 4 cores
  • 8GB RAM
  • 60GB disk


Once you have all your machines ready, follow the steps given below. My steps assume Linux-based setups, but you can do the same on any other Splunk-supported OS. Make sure to change paths accordingly.

0) If you haven't done so already, change the default admin password 'changeme' to something else. None of the SHC setup commands will work properly while your admin password is still the default one.

1) On the Deployer:
In /opt/splunk/etc/system/local/server.conf, add the following line under the [shclustering] stanza:
pass4SymmKey = yourKey

Replace yourKey with your plaintext key. Do not worry, Splunk encrypts it in server.conf the next time it restarts.

2) Initialize all search head cluster members:
On each SH, run these commands:
/opt/splunk/bin/splunk init shcluster-config -auth admin:splunk -mgmt_uri <mgmt uri of this setup> -replication_port <any unusual port like 20000> -conf_deploy_fetch_url <mgmt uri of deployer> -secret yourKey
/opt/splunk/bin/splunk restart

At this point, each SH where you ran the above commands knows which Deployer serves it and which key to authenticate with.
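Since the same command has to be run on every search head with only the management URI changing, it can help to generate the command lines rather than type them. Below is a minimal Python sketch that does this. The host names, the password and the helper name build_init_command are hypothetical placeholders; only the flags are taken from the command above.

```python
# Hypothetical management URIs; replace them with your own hosts.
SEARCH_HEADS = [
    "https://sh1.example.com:8089",
    "https://sh2.example.com:8089",
    "https://sh3.example.com:8089",
]
DEPLOYER = "https://deployer.example.com:8089"


def build_init_command(mgmt_uri, deployer_uri, secret, auth, replication_port=20000):
    """Assemble the 'splunk init shcluster-config' arguments for one search head."""
    return [
        "/opt/splunk/bin/splunk", "init", "shcluster-config",
        "-auth", auth,
        "-mgmt_uri", mgmt_uri,
        "-replication_port", str(replication_port),
        "-conf_deploy_fetch_url", deployer_uri,
        "-secret", secret,
    ]


if __name__ == "__main__":
    # Print one command line per search head; run each on its own machine.
    for uri in SEARCH_HEADS:
        print(" ".join(build_init_command(uri, DEPLOYER, "yourKey", "admin:yourPassword")))
```

The script only prints the commands. Each one still has to be executed on the matching search head, followed by a restart.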

3) Bring up the cluster captain:
This step is required only for a search head cluster; you can omit it if you are not setting up an SHC. Run the command on one member only, the one that should become the first captain.
/opt/splunk/bin/splunk bootstrap shcluster-captain -servers_list "<comma-separated list of mgmt uri of all search heads, including designated captain>" -auth <this setup's username:password>
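The -servers_list value is easy to get wrong by hand, because it must be a single comma-separated string that lists every member. Here is a small Python sketch that builds it. The host names, the password and the helper name build_bootstrap_command are hypothetical.

```python
def build_bootstrap_command(search_head_uris, auth):
    """Assemble the 'splunk bootstrap shcluster-captain' arguments."""
    # All members, including the designated captain, joined by commas.
    servers_list = ",".join(search_head_uris)
    return [
        "/opt/splunk/bin/splunk", "bootstrap", "shcluster-captain",
        "-servers_list", servers_list,
        "-auth", auth,
    ]


if __name__ == "__main__":
    # Hypothetical management URIs of the three search heads.
    members = [
        "https://sh1.example.com:8089",
        "https://sh2.example.com:8089",
        "https://sh3.example.com:8089",
    ]
    args = build_bootstrap_command(members, "admin:yourPassword")
    # Quote the list the same way the command above does.
    args[4] = '"' + args[4] + '"'
    print(" ".join(args))
```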

4) Check search head cluster status:
To check the overall status of your search head cluster, run this command from any of the members:
/opt/splunk/bin/splunk show shcluster-status -auth <this setup's username:password>

5) Deploy the bundle (app):
Run this on the Deployer, after placing the apps under /opt/splunk/etc/shcluster/apps:
/opt/splunk/bin/splunk apply shcluster-bundle -target <mgmt uri of SH where you want to deploy app> -auth <SH's username:password>

Your Search Head Cluster setup should be ready and operational now.

What is Splunk?

It's like Google for logs!

When you need to debug some application or system, what do you do? You go through log files. They tell you (almost) everything about what it was trying to do, and what happened. But what do you do when you need to debug a distributed, cloud-based, or microservices-based system? Do you go to each and every machine/app and try to correlate that information with the logs of another machine/app? Do you always have a design where all log lines from all those machines or apps are written to a single log? Usually not.

That's where Splunk is really useful. It's a log processing and analysis product that stores all your logs in an indexed manner and provides very fast search.

Events and Indexes

Each entry that gets stored in Splunk is called an 'event', and the logical place where a particular event is stored is called an 'index'. So, when searching, you basically query some index(es) to find some events.
Each indexed event has 4 fields associated with it: time, sourcetype, source and host. The time field indicates when that event happened. Sourcetype identifies the data structure of that event, whereas source identifies where that event came from. The host is the machine where the event was generated. You can search your data using these fields.
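To make the four fields concrete, here is a toy Python model of events and a field-based filter. It is only an illustration of the idea, not how Splunk stores or searches data. The sample events, the "raw" key and the search() helper are all made up for this sketch.

```python
# Made-up sample events; each carries the four fields described above.
EVENTS = [
    {"time": "2016-10-20 01:15:02", "sourcetype": "syslog",
     "source": "/var/log/messages", "host": "web01", "raw": "sshd: session opened"},
    {"time": "2016-10-20 01:17:40", "sourcetype": "access_combined",
     "source": "/var/log/httpd/access_log", "host": "web01", "raw": "GET /index.html 200"},
    {"time": "2016-10-20 02:03:11", "sourcetype": "syslog",
     "source": "/var/log/messages", "host": "db01", "raw": "kernel: out of memory"},
]


def search(events, **criteria):
    """Return the events whose fields equal every given key=value pair."""
    return [
        event for event in events
        if all(event.get(field) == value for field, value in criteria.items())
    ]


if __name__ == "__main__":
    for event in search(EVENTS, host="web01", sourcetype="syslog"):
        print(event["time"], event["raw"])
```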

Apart from that, Splunk extracts most of the fields from your data, and those can also be used when searching.

The Splunk setup which indexes the data is called an indexer.

They call it SPL

SPL stands for Search Processing Language. It's like a query you enter to grab some data out of Splunk index(es).

Your search can be a simple term (e.g. a username) to see how frequently it appears in the logs, or it can be a complex one (e.g. a particular source, a particular event, containing this or that, which happened between 1am and 4:40am).
There are endless possibilities for what you can search using Splunk. We use Splunk in my company to analyze application logs and find out which exceptions occur most often and on which days they peak.
There's a lot to say about this, and the best way to learn SPL is to go through Splunk's own documentation on it.

Apps, Add-ons and Data Sources

OK. We have got at least some idea about what it does. But tell me, how does it get my log data?!
Well, there's no big magic in that. There are apps, add-ons and other data import sources supported in Splunk, using which you bring in the data. You can import log files, be it syslog, csv or json. You can also develop Splunk apps or add-ons that make API calls to the outside world and produce data understandable to Splunk! Splunk is capable of consuming data written to stdout/stderr!

There's no database

That brings us to the next point: where does that data go eventually? Let me tell you that there is no database to manage. Splunk stores data directly in the file system, which is one reason a Splunk setup is quite fast.

Scalability is easy

If a single Splunk server is not enough, you can simply add another one. The data can be distributed among multiple Splunk setups. You can have the same forwarder sending your data to multiple indexers, or multiple forwarders feeding data to a single indexer, or a combination of both. Not only that, you can also distribute your search operations using multiple 'search heads'. Two interesting deployment scenarios, known as Indexer Cluster and Search Head Cluster, are worth exploring when scalability is important.

Wednesday, October 5, 2016

Expand and shrink IPv4 range

A few months ago, I wrote this blog post about chunking an IPv4 range into multiple sub-ranges. Soon thereafter arose a requirement to have something to expand or shrink a given IPv4 range, and I came up with this code.

There are two functions: one expands a given IPv4 range and the other shrinks it. Let me explain them one by one.

The expand_range() function

The input IPv4 range could be a comma-separated list of IPv4 addresses, a proper range (e.g. 10.10.10.10-10.10.10.155), or a mix of both. The objective of this function is to provide a list of ALL IPv4 addresses that are part of the given range.

This function starts by exploding the input on commas. For each element in the resulting array, it checks whether it's a single IPv4 address or a range containing a dash (-) character.

If it contains a dash, the function further explodes it to get the start and end IPv4 addresses, and converts them to long using ip2long(). Then, it simply runs a for loop to generate all the long values from start to end and adds them to the output array.

If it's a single IPv4 address, that is added as it is to the output array.

Finally, it sorts the output array, removes duplicates using array_unique(), and converts each element back to an IPv4 address using long2ip().

Based on the optional second argument, it returns either an array or a string representation of the expanded IPv4 range.
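The steps above can be sketched in Python as follows. This is not the original code: it uses Python's standard ipaddress module in place of ip2long()/long2ip(), collects values in a set instead of calling array_unique(), and the parameter name as_string and its default (return a list) are my assumptions.

```python
import ipaddress


def _to_long(address):
    """Dotted IPv4 string to integer (the role ip2long() plays in the text)."""
    return int(ipaddress.IPv4Address(address.strip()))


def _to_ip(value):
    """Integer to dotted IPv4 string (the role of long2ip())."""
    return str(ipaddress.IPv4Address(value))


def expand_range(ip_range, as_string=False):
    """List every IPv4 address covered by a comma-separated mix of
    single addresses and start-end ranges."""
    values = set()  # a set drops duplicates as we go
    for part in ip_range.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = _to_long(start_text), _to_long(end_text)
            # Every value from start to end, both included.
            values.update(range(start, end + 1))
        else:
            values.add(_to_long(part))
    addresses = [_to_ip(value) for value in sorted(values)]
    return ",".join(addresses) if as_string else addresses


if __name__ == "__main__":
    print(expand_range("10.10.10.10-10.10.10.12,10.10.10.20", as_string=True))
```

Converting to integers before sorting matters: sorting the dotted strings directly would put 10.10.10.100 before 10.10.10.20.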


The shrink_range() function

Here, the input IPv4 range could be either a string representation or an array, similar to the one output by the expand_range() function. The objective of this function is to shorten the given IPv4 range. So, if the input is 10.10.10.10,10.10.10.11,10.10.10.12,10.10.10.13 then it should shorten it to 10.10.10.10-10.10.10.13.

The function starts by creating an array out of the given IPv4 addresses. If the first argument is itself an array, it just copies it. It then takes the count() of the array elements so that it can loop over them.

In the loop, it checks the IPv4 address at the current index as well as the one at the next index. It converts both of them to long and calculates the difference by subtracting the current index's long value from the next index's long value.

If that difference is 1, the next IPv4 address is in sequence with the current IPv4 address. That also means we are getting into a range which can be shortened. The function then sets a flag to remember that, and starts building the string for this short range.

Otherwise, it's either a standalone IPv4 address or we have reached the end of a short range. So it checks whether the flag is still ON. If yes, it ends the short range and copies the short range string into the output array. If we aren't building any short range, it simply adds the current IPv4 address to the output array.

Finally, based on the optional second argument, it returns either an array or a string representation of the shrunken IPv4 range.
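A rough Python equivalent of this loop is sketched below. Again, this is not the original code. The variable range_start doubles as the flag (None means we are not inside a short range), the input is sorted and de-duplicated first (an assumption; the description above does not say the original does that), and the as_string parameter and its default are my own naming.

```python
import ipaddress


def _to_long(address):
    """Dotted IPv4 string to integer."""
    return int(ipaddress.IPv4Address(address.strip()))


def _to_ip(value):
    """Integer to dotted IPv4 string."""
    return str(ipaddress.IPv4Address(value))


def shrink_range(ip_range, as_string=False):
    """Collapse runs of consecutive IPv4 addresses into start-end ranges."""
    if isinstance(ip_range, str):
        parts = [part for part in ip_range.split(",") if part.strip()]
    else:
        parts = list(ip_range)
    # Assumption: sort and de-duplicate so consecutive addresses sit together.
    values = sorted({_to_long(part) for part in parts})

    output = []
    range_start = None  # acts as the flag: not None while inside a short range
    for index, current in enumerate(values):
        next_value = values[index + 1] if index + 1 < len(values) else None
        if next_value is not None and next_value - current == 1:
            # The next address is in sequence; remember where the run began.
            if range_start is None:
                range_start = current
        elif range_start is not None:
            # The run ends at the current address.
            output.append(_to_ip(range_start) + "-" + _to_ip(current))
            range_start = None
        else:
            # A standalone address.
            output.append(_to_ip(current))
    return ",".join(output) if as_string else output


if __name__ == "__main__":
    print(shrink_range("10.10.10.10,10.10.10.11,10.10.10.12,10.10.10.13", as_string=True))
```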