Azure Mobile Service + Service Fabric = Easy Monitoring

I want to thank all my readers who appreciated my previous blogs on Service Fabric. It is indeed a motivation which drives an tech-enthusiast like me. While driving home, I was thinking of my next blog, when all of a sudden I remembered of a POC which involved Service Fabric Rest API and Azure Mobile Service.

Intent of the POC

The intent of POC was to create a quick efficient way of monitoring Service Fabric cluster without writing tons of code and making things complex. The POC ask was from a client who wanted a feasible solution to monitor his Service Fabric Cluster. The ETA for this was decided to be 7 days, but believe me the first POC was ready by end of second day.

Why Azure Mobile Service?

Azure Mobile Service is an awesome tool for developers who want to write lightweight services, which runs continuously without incurring much effort and cost. I was aware of AMS as I wrote couple of scripts interacting with facebook api and Linkedin in past. (If interested, you can take a look into my netme blog). I thought it will be efficient if the same scripts are used for Service Fabric monitoring. As Service Fabric was exposing REST APIs , it was easy to perform a call from AMS and output the result into Azure Table Storage. A MVC app was created to read the Azure Table storage and show the output in a graphical chart.

Acknowledgement

I want to acknowledge Balaji who mentored me through out this process and made the simple blatant script turn out to be a generic utility.

Let's get our hands dirty

Before starting the problem statement was broken down to major 2 chunks
1. Do Rest API calls to Service Fabric
2. Store the output in table storage

Performing above 2 operations were pretty simple as AMS provide 2 Node.JS packages
a. azure
b. request

If you are new to Node.Js no issues, following script section will help one understand what is the usage for them

Pre-requisite

Ensure that the Http gateway is enabled in Service Fabric Cluster. This will be a section in Cluster Manifest

   <Section Name="HttpGateway">
        <Parameter Name="IsEnabled" Value="true"/>
    </Section>

Also in your ServiceDefinition.csdef under Endpoints, we should have an endpoint for HttpGateway

      <InputEndpoint name="HttpGatewayListenAddress" protocol="http" port="19007" />

In my case the endpoint port is 19007

Do Rest API calls to Service Fabric

Service Fabric provides multiple REST Apis which are documented at https://msdn.microsoft.com/library/azure/dn707692.aspx

For this POC we will discuss few of them

1. In order to start with the POC first we need to create a AMS service



2. Once the mobile service is created one should go to the Scheduler section and create a new job


3. Once the scheduler job is created one can click the AMS scheduler job and go to the Scheduler section, where we will get an area to write our script

4.  Once the job is created one will see an empty function with the job name. Consider this to be the first entry point to the script. As my job name is smurfjob , I got a function name with the same name

function smurfjob() {
      console.log("Starting My App Monitor");
      console.log("This will monitor Node, Application, App Services , Sestem Services, Partition and Replica");
    var dataTypes = [
        "Node"
        , "Application"
        , "AppServices"
        , "SystemServices"
        , "Partition"
        , "Replica"
        ];
    dataTypes.forEach(getData);
}

As mentioned in the above log statements we are going to monitor Service Fabric Node, Application, Application Services, System Services, Partition and Replica

5. Our next module will be saying what is the input and what should be the output. For instance if input is AppServices, it should go to MyAppCurrTable storage table

function getData(dataType, index, array) {
    if (dataType == "Node") {
        getNodesData();
    } else if (dataType == "Application") {
        getApplicationData();
    } else if (dataType == "SystemServices") {
        fetchAndUpdateServicesDataForAppName("System");
    } else if (dataType == "AppServices") {
        getChildDataForParent("MyAppCurrTable", fetchAndUpdateServicesData);
    } else if (dataType == "Partition") {
        getChildDataForParent("MySvcCurrTable", fetchAndUpdatePartitionData);
    } else if (dataType == "Replica") {
        getChildDataForParent("MyPrtnCurrTable", fetchAndUpdateReplicaData);
    }
}

6. Now let's have the modules for different datatypes

Nodes

Let me now take a single module and explain what we want

function getNodesData() {
    //Import the request node.js and azure node.js module
    var req = require('request');
    var azure = require('azure');
 
    // Url for fetching the Node Data 
    var nodeUrlToMonitor = "http://serviceFabricClusterName.cloudapp.net:19007/Nodes?api-version=1.0";
 
    console.log("Url to Monitor" + urlToMonitor);
 
    // fetch Node data
    req.get({
    uri:urlToMonitor,
    timeout : 200000,
    headers:{'content-type': 'application/x-www-form-urlencoded'},
   },
   function (error, response, body) {

   //This is where we fetch the response body
       if (!error && response.statusCode == 200) {
       
           //Parse the json body
           var items = JSON.parse(body);
         
           //Now create the azure storage Table 
           var tableService = azure.createTableService("azure storage account name", "azure storage account key");

           tableService.createTableIfNotExists(TableName, function (error) {
               if (!error) {

                      //Parse the items in json and insert them in Azure Table Storage
                      tableService.insertEntity( parse relevant info)
               }
               else {
                   console.log("CreateTableIfNotExists for " + currentTableName + " returned error: " + error);
               }
            })
       }
       else {
           console.log("ERROR: GET " + urlToMonitor + " returned error: " + error + "and statusCode " + response.statusCode);
       }
   })

}

Hope in the above section we got the actual intent of the idea mentioned earlier. Following is a snapshot of tabular structure of Application service

PartitionKey RowKey ServiceKind Name TypeName ManifestVersion HasPersistedState HealthState ServiceStatus IsServiceGroup AppName
System_ClusterManagerService 5VbGMVNM2EZ7NFYu81gAfO Stateful fabric:/System/ClusterManagerService ClusterManagerServiceType 4.0.99.9490 TRUE Ok 1 FALSE System
System_FailoverManagerService GSc38KxoFFXFThEJbP+vQ0 Stateful fabric:/System/FailoverManagerService FMServiceType 4.0.99.9490 TRUE Ok 1 FALSE System
System_ImageStoreService YcPTmeiqsR0Vh33CpC+7nC Stateful fabric:/System/ImageStoreService FileStoreServiceType 4.0.99.9490 TRUE Ok 1 FALSE System
System_InfrastructureService uEvwax1amAIoal0LvoRL0y Stateful fabric:/System/InfrastructureService InfrastructureServiceType 4.0.99.9490 FALSE Ok 1 FALSE System
System_NamingService x+vT3v9YgNundefinedYvZJPgbTKvo Stateful fabric:/System/NamingService NamingStoreService 4.0.99.9490 TRUE Ok 1 FALSE System
System_RepairManagerService 1bj5zeS2ReeKwbPVSQ1dOE Stateful fabric:/System/RepairManagerService RepairManagerServiceType 4.0.99.9490 TRUE Ok 1 FALSE System

Now one can easily create a MVC app or use PowerBI to read this data , and create a simple dashboard

7. Once all this is done you can save the script and enable it. AMS will act as a background task manager script, and will keep uploading the results as required

Here is the full Azure Mobile Script

function smurfjob() {
      console.log("Starting My App Monitor");
      console.log("This will monitor Node, Application, App Services , Sestem Services, Partition and Replica");
    var dataTypes = [
        "Node"
        , "Application"
        , "AppServices"
        , "SystemServices"
        , "Partition"
        , "Replica"
        ];
    dataTypes.forEach(getData);
}


function getData(dataType, index, array) {
    if (dataType == "Node") {
        getNodesData();
    } else if (dataType == "Application") {
        getApplicationData();
    } else if (dataType == "SystemServices") {
        fetchAndUpdateServicesDataForAppName("System");
    } else if (dataType == "AppServices") {
        getChildDataForParent("MyAppCurrTable", fetchAndUpdateServicesData);
    } else if (dataType == "Partition") {
        getChildDataForParent("MySvcCurrTable", fetchAndUpdatePartitionData);
    } else if (dataType == "Replica") {
        getChildDataForParent("MyPrtnCurrTable", fetchAndUpdateReplicaData);
    }
}

function getNodesData() {
    var req = require('request');
    var azure = require('azure');
    var nodeUrlToMonitor = "http://servicefabricname.cloudapp.net:19007/Nodes?api-version=1.0";
    var nodeHistoryTableName = "NodeHistoryTable";
    var nodeCurrentTableName = "NodeCurrentTable";
    var nodeEventsTableName = "NodeEventsTable";
    var nodePrimaryKeyName = "Name";
    
    fetchAndUpdate("Node", nodeUrlToMonitor, nodePrimaryKeyName, nodeCurrentTableName, nodeEventsTableName, nodeHistoryTableName, 15, null);
}

function getApplicationData() {
    var req = require('request');
    var azure = require('azure');
    var appCurrentTableName = "MyAppCurrTable";
    var appEventsTableName = "MyAppAppEventsTable";
    var appHistoryTableName = "MyAppAppHistoryTable";
    var applicationURL = "http://servicefabricname.cloudapp.net:19007/Applications?api-version=1.0";
    var appPrimaryKeyName = "Id";
    
    fetchAndUpdate("Application", applicationURL, appPrimaryKeyName, appCurrentTableName, appEventsTableName, appHistoryTableName, 15, null);
}

function getChildDataForParent(parentTableName, fetchAndUpdateChildData) {
    var azure = require('azure');
    var tableService = azure.createTableService("storageaccountname", "storageaccountkey");
    var getMatchingEntities = azure.TableQuery
    .select()
    .from(parentTableName);
    tableService.queryEntities(getMatchingEntities, function (error, entities) {
        if (!error) {
            entities.forEach(fetchAndUpdateChildData);
        }
        else {
            console.log("ERROR: queryEntities in " + parentTableName + " returned error " + error);
        }
    });
}

function fetchAndUpdateServicesData(item, index, array) {
    fetchAndUpdateServicesDataForAppName(item.PartitionKey);
}

function fetchAndUpdateServicesDataForAppName(appName) {
    var svcCurrentTableName = "MySvcCurrTable";
    var svcEventsTableName = "MyAppSvcEventsTable";
    var svcHistoryTableName = "MyAppSvcHistoryTable";
    var serviceURL = "http://servicefabricname.cloudapp.net:19007/Applications/" + appName + "/$/GetServices?api-version=1.0";
    var svcPrimaryKeyName = "Id";
    var additionalData = {"AppName": appName};
    console.log("Fetching svc data for app " + appName);
    fetchAndUpdate("Service", serviceURL, svcPrimaryKeyName, svcCurrentTableName, svcEventsTableName, svcHistoryTableName, 15, additionalData);
}

function fetchAndUpdatePartitionData(item, index, array) {
    var svcName = item.PartitionKey.toString();
    var appName = item.AppName;
    fetchAndUpdatePartitionDataForServiceName(appName, svcName);
}

function fetchAndUpdatePartitionDataForServiceName(appName, svcName) {
    var prtnCurrentTableName = "MyPrtnCurrTable";
    var prtnEventsTableName = "MyAppPrtnEventsTable";
    var prtnHistoryTableName = "MyAppPrtnHistoryTable";
    var updatedSvcName = svcName.toString().replace('_', '/');
    console.log("Fetching partition data for app " + appName + " and svc " + updatedSvcName);
    var partitionURL = "http://servicefabricname.cloudapp.net:19007/Applications/" + appName + "/$/GetServices/" + updatedSvcName + "/$/GetPartitions?api-version=1.0";
    var prtnPrimaryKeyName = "PartitionId";
    var additionalData = {"AppName": appName, "ServiceName": updatedSvcName};
    fetchAndUpdate("Partition", partitionURL, prtnPrimaryKeyName, prtnCurrentTableName, prtnEventsTableName, prtnHistoryTableName, 15, additionalData);
}

function fetchAndUpdateReplicaData(item, index, array) {
    var svcName = item.ServiceName.toString().replace("_", "/");
    fetchAndUpdateReplicaDataForPartitionId(item.AppName, svcName, item.PartitionKey, item.ServiceKind);
}

function fetchAndUpdateReplicaDataForPartitionId(appName, svcName, partitionId, serviceKind) {
    var replCurrentTableName = "MyAppReplCurrTable";
    var replEventsTableName = "MyAppReplEventsTable";
    var replHistoryTableName = "MyAppReplHistoryTable";
    var updatedSvcName = svcName.toString().replace('_', '/');
    console.log("Fetching replica data for app " + appName + " and svc " + updatedSvcName + " and prtnId " + partitionId);
    var replicaURL = "http://servicefabricname.cloudapp.net:19007/Applications/" + appName + "/$/GetServices/" + svcName + "/$/GetPartitions/" + partitionId + "/$/GetReplicas?api-version=1.0";
    var replicaPrimaryKeyName = "NodeName";
    /*
    if (serviceKind == "Stateless") {
        replicaPrimaryKeyName = "InstanceId";
    }
    */
    var additionalData = {"AppName": appName, "ServiceName": svcName, "PartitionId": partitionId};
    fetchAndUpdate("Replica", replicaURL, replicaPrimaryKeyName, replCurrentTableName, replEventsTableName, replHistoryTableName, 15, additionalData);
}

function fetchAndUpdate(dataType, urlToMonitor, primaryKeyName, currentTableName, eventsTableName, historyTableName, frequency, additionalData)
{
    var date = new Date(); 
    var current_min = date.getMinutes();
    var fUpdate = current_min % frequency;

    var req = require('request');
    var azure = require('azure');

     console.log("fetchAndUpdate " + urlToMonitor);
    // fetch data
    req.get({
    uri:urlToMonitor,
    timeout : 200000,
    headers:{'content-type': 'application/x-www-form-urlencoded'},
   }, 
   function (error, response, body) {
       if (!error && response.statusCode == 200) {
           var items = JSON.parse(body);
           // console.log(items);
           var tableService = azure.createTableService("storageaccountname", "storageaccountkey");
           tableService.createTableIfNotExists(currentTableName, function (error) {
               if (!error) {
                   // for each row
                   items.forEach(updateTables.bind(null, dataType, primaryKeyName, urlToMonitor, currentTableName, eventsTableName, historyTableName, fUpdate, additionalData));
                   // items.forEach(getHealth.bind(null, dataType, primaryKeyName, urlToMonitor, additionalData));
               }
               else {
                   console.log("CreateTableIfNotExists for " + currentTableName + " returned error: " + error);
               }
            })
       }
       else {
           console.log("ERROR: GET " + urlToMonitor + " returned error: " + error + "and statusCode " + response.statusCode);
       }
   })
}

function getHealth(dataType, primaryKeyName, primaryKeyValueToUse, urlToMonitor, additionalData, item
// , index, array
) {
    // Modify url to add data and GetHealth query
    // console.log("getHealth " + dataType + " item " + item[primaryKeyName]);
    var primaryKeyValue = item[primaryKeyName];
    if (dataType == "Partition") {
        primaryKeyValue = primaryKeyValueToUse;
    }
    
    /*
    var primaryKeyValue = item[primaryKeyName];
    if (dataType == "Partition") {
        var partitionInformation = item["PartitionInformation"];
        primaryKeyValue = partitionInformation["Id"];
    }
    */
    var healthUrl = urlToMonitor.replace("?api-version=1.0", "/" + primaryKeyValue +"/$/GetHealth?api-version=1.0");
    var req = require('request');

    req.get({
    uri:healthUrl,
    timeout : 200000,
    headers:{'content-type': 'application/x-www-form-urlencoded'},
   }, 
   function (error, response, body) {
       if (!error && response.statusCode == 200) {
           var items = JSON.parse(body);
           // console.log(items);
            var entityName = primaryKeyValue;
            if (dataType == "Service") {
                entityName = primaryKeyValue.toString().replace('/', '_');
            }
            else if (dataType == "Replica") {
                entityName = additionalData["PartitionId"] + "_" + primaryKeyValue;
            }
           items["HealthEvents"].forEach(updateHealthEventsTable.bind(null, dataType, entityName));
       }
       else {
           console.log("ERROR: GET health for " + healthUrl + " returned error: " + error + "and statusCode " + response.statusCode);
       }
   })
}

function updateHealthEventsTable(dataType, entityName, item, index, array) {
    var healthEventsTable = "MyAppHealthEventsTable";
    var azure = require('azure');
    var tableService = azure.createTableService("storageaccountname", "storageaccountkey");
    tableService.createTableIfNotExists(healthEventsTable, function (error) {
       if (!error) {
           item["PartitionKey"] = entityName;
           item["RowKey"] = item["SequenceNumber"];
           item["DataType"] = dataType;
           item["HealthState"] = mapEnum("HealthState", item["HealthState"]);
           tableService.insertOrReplaceEntity(healthEventsTable, item, function(error) {
                    if (!error) {
                        console.log(item.PartitionKey + " Health event updated");
                    }
                    else {
                        console.log("ERROR while updating health event for " + item.PartitionKey + " : " + error);
                    }
           }) 
       }
       else {
           console.log("ERROR: CreateTableIfNotExists for " + healthEventsTable + " returned error: " + error);
       }
    })
}

function updateTables(dataType, primaryKeyName, urlToMonitor, currentTableName, eventsTableName, historyTableName, fUpdate, additionalData, item, index, array) {
    var azure = require('azure');
    var primaryKeyValue = item[primaryKeyName];
    if (dataType == "Service") {
        primaryKeyValue = item[primaryKeyName].toString().replace('/', '_');
    }
    else if (dataType == "Partition") {
        var partitionInformation = item["PartitionInformation"];
        primaryKeyValue = partitionInformation["Id"];
        item["PartitionInformation"] = null;
    }
    else if (dataType == "Replica") {
        primaryKeyValue = additionalData["PartitionId"] + "_" + item[primaryKeyName];
    }
    
    for (var prop in additionalData) {
        item[prop] = additionalData[prop];
    }
    
    var getMatchingEntities = azure.TableQuery
    .select()
    .from(currentTableName).where('PartitionKey eq ?', primaryKeyValue);
    var tableService = azure.createTableService("storageaccountname", "storageaccountkey");
    tableService.queryEntities(getMatchingEntities, function (error, entities) {
        if (!error) {
            if (entities.length > 0) {
                for (var prop in item) {
                    // console.log("item[" + prop + "] = " + item[prop] + " ; eProp " + entities[0][prop]);
                    if (prop != "id" && prop != "Id" && prop != "link" && prop != "updated" && prop != "etag" && prop != "PartitionKey" && prop != "RowKey" && prop != "Timestamp") {
                        // if different from currenttable, then update events table
                        if (dataType == "Application" && prop == "Parameters") {
                            item[prop] = JSON.stringify(item[prop]);
                        }
                        item[prop] = mapEnum(prop, item[prop]);
                        var iProp = (item[prop] == null) ? null : item[prop].toString();
                        var eProp = (entities[0][prop] == null) ? null : entities[0][prop].toString();
                        if (iProp != eProp) {
                            writeToEventsTable(eventsTableName, primaryKeyValue, prop, item[prop]);
                            entities[0][prop] = item[prop];
                        }
                    }
                }
                // update row of existing entity
                item["PartitionKey"] = primaryKeyValue;
                item["RowKey"] = entities[0].RowKey;
                tableService.updateEntity(currentTableName, item, function (error) {
                    if (!error) {
                        // console.log(item.PartitionKey + " Entity updated");
                    }
                    else {
                        console.log("ERROR while updating entity " + item.PartitionKey + " : " + error);
                    }
                });
            }
            else {
               item["PartitionKey"] = primaryKeyValue;
               item["RowKey"] = randomString(128);
               for (var prop in item) {
                    item[prop] = mapEnum(prop, item[prop]);
                    if (dataType == "Application" && prop == "Parameters") {
                        var jsonValue = JSON.stringify(item[prop]);
                        console.log("App Parameters = " + jsonValue);
                        item[prop] = jsonValue;
                    }
               }
               tableService.insertEntity(currentTableName, item, function (error) {
                   if (!error) {
                       console.log(item.PartitionKey + " " + dataType + " Entity inserted in current table");
                   }
                   else {
                       console.log("Error while inserting entity " + item.PartitionKey + " in current table : " + error);
                   }
               });
            }
            // Update history table if frequency is met
            if (fUpdate == 0) {
               tableService.createTableIfNotExists(historyTableName, function (error) {
                   if (!error) {
                       item["RowKey"] = randomString(128);
                       tableService.insertEntity(historyTableName, item, function (error) {
                           if (!error) {
                               // console.log(item.PartitionKey + " Entity inserted in history table");
                           }
                           else {
                               console.log("ERROR while inserting entity " + item.PartitionKey + " in history table : " + error);
                           }
                       });
                   }
               });
            }
        }
        else {
            console.log("queryEntities returned error: " + error);
        }
    });
    getHealth(dataType, primaryKeyName, primaryKeyValue, urlToMonitor, additionalData, item);
}

function mapEnum(prop, value) {
    var retVal = value;
    var NodeStatus = {
        "0": "Invalid"
        , "1": "Up"
        , "2": "Down"
        , "3": "Enabling"
        , "4": "Disabling"
        , "5": "Disabled"
    }
    var HealthState = {
        "0": "Invalid"
        , "1": "Ok"
        , "2": "Warning"
        , "3": "Error"
        , "65535": "Unknown"
    }
    var ReplicaRole = {
        "0": "Invalid"
        , "1": "None"
        , "2": "Primary"
        , "3": "IdleSecondary"
        , "4": "ActiveSecondary"
    }
    var ReplicaStatus = {
        "0": "Invalid"
        , "1": "InBuild"
        , "2": "Standby"
        , "3": "Ready"
        , "4": "Down"
        , "5": "Dropped"
    }
    var ServiceKind = {
        "0": "Invalid"
        , "1": "Stateless"
        , "2": "Stateful"
    }
    switch (prop) {
        case "NodeStatus":
            retVal = NodeStatus[value];
            break;
        case "HealthState":
            retVal = HealthState[value];
            break;
        case "ReplicaRole":
            retVal = ReplicaRole[value];
            break;
        case "ReplicaStatus":
            retVal = ReplicaStatus[value];
            break;
        case "ServiceKind":
            retVal = ServiceKind[value];
            break;
        default:
            retVal = value;
            break;
    }
    return retVal;
}

function writeToEventsTable(eventsTableName, partitionKey, propertyName, propertyValue)
{
    var azure = require('azure');
    var tableService = azure.createTableService("storageaccountname", "storageaccountkey");
    tableService.createTableIfNotExists(eventsTableName, function (error) {
        if (!error) {
            var rowKey = randomString(128);
            var task = {
                PartitionKey: partitionKey
                            , RowKey: rowKey
                            , NotifyStateName: propertyName
                            , NotifyStateValue: propertyValue
            }
            tableService.insertEntity(eventsTableName, task, function (error) {
                if (!error) {
                    // console.log('Event table updated for ' + partitionKey + ' property ' + propertyName);
                }
                else {
                    console.log('ERROR while Event Table update: ' + error + ";pk = " + partitionKey + "; rk = " + rowKey + "; pname = " + propertyName + "; pval = " + propertyValue);
                }
            });
        }
    });
}

function randomString(bits){var chars,rand,i,ret
          chars='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+'
          ret=''
          // in v8, Math.random() yields 32 pseudo-random bits (in spidermonkey it gives 53)
          while(bits > 0){
              rand=Math.floor(Math.random()*0x100000000) // 32-bit integer
            // base 64 means 6 bits per character, so we use the top 30 bits from rand to give 30/6=5 characters.
            for(i=26; i>0 && bits>0; i-=6, bits-=6) ret+=chars[0x3F & rand >>> i]}
              return ret
}



Hope this blog will give readers confidence of creating a service fabric monitor with less cost and effort. Any suggestion or views are always welcome.

Comments

  1. It's very interesting! Thanks for sharing, :)
    But it's not easy to understand. Do you have solution host on github?

    ReplyDelete
  2. Great Idea will be sharing this in a github project soon. Hope after the offline discussion we had this is clear to you :)

    ReplyDelete

Post a Comment

Popular posts from this blog

Firebase authentication with Ionic creator

Big Data - SWOT Analysis

LINKEDIN api call using NODE.JS OAUTH module