Thursday, June 30, 2016

Big Data Analytics with the WSO2 Analytics Platform

Organizations have more data than ever at their disposal. Deriving meaningful insights from that data, and converting that knowledge into action, is easier said than done because no single technology encompasses big data analytics. Instead, several types of technology work together to help organizations get the most value from their information.

Big data analytics is the process of examining large data sets containing a variety of data types - i.e., big data - to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. The analytical findings can lead to more effective marketing, new revenue opportunities, better customer service, improved operational efficiency, competitive advantages over rival organizations and other business benefits. Data could include Web server logs and Internet click stream data, social media content and social network activity reports, text from customer emails and survey responses, mobile-phone call detail records and machine data captured by sensors connected to the Internet of Things. 

What does WSO2 offer?

The WSO2 Analytics Platform, of course. It’s a single platform to address all analytics styles – 
  • Batch Analytics: Analysis of data at rest, running typically every hour or every day, and focused on historical dashboards and reports.
  • Real-time Analytics: Analysis of event streams in real time to detect patterns and conditions.
  • Predictive Analytics: Using machine learning to create a mathematical model which allows predicting future behavior.
  • Interactive Analytics: Executing queries on the fly on top of data at rest. 
In a nutshell, you collect data and feed it to the analytics platform. The platform then analyzes the collected data using one or more of the above techniques and communicates the results as alerts, dashboards, notifications, and so on.
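
To make the difference between the batch and real-time styles concrete, here is a minimal Python sketch. It is not WSO2 code: the event fields (`ts`, `status`) and the names `batch_counts` and `ErrorBurstDetector` are assumptions made up for illustration. The batch function summarizes data at rest, while the detector looks at one event at a time and flags a pattern as soon as it appears.

```python
from collections import Counter, deque


def batch_counts(events):
    """Batch style: summarize stored events, here as requests per status code."""
    return Counter(event["status"] for event in events)


class ErrorBurstDetector:
    """Real-time style: flag `threshold` server errors within `window` seconds."""

    def __init__(self, threshold, window):
        self.threshold = threshold
        self.window = window
        self.recent = deque()  # timestamps of recent error events

    def on_event(self, event):
        """Process one event as it arrives; return True if the pattern matched."""
        if event["status"] < 500:
            return False
        now = event["ts"]
        self.recent.append(now)
        # Drop error timestamps that have fallen out of the time window.
        while now - self.recent[0] > self.window:
            self.recent.popleft()
        return len(self.recent) >= self.threshold
```

The batch function needs the whole data set before it can answer, whereas the detector keeps only a small window of state and can raise an alert on the very event that completes the pattern.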

Some Terminology

The WSO2 Analytics Platform processes data as events and also interacts with external systems using events. Let’s get the terminology right.  

Event = a unit of data comprising a set of attributes
Event Stream = a sequence of events of a particular type
Event Stream Definition = the type, or schema, of the events in a stream
Event Receiver = a component that receives events through various transport protocols
Event Publisher = a component that publishes events, whether produced by data analysis or passed through directly, via various transport protocols
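
As a rough illustration of how the first three terms relate, here is a small Python sketch. The class names and the `login_attempts` stream are hypothetical and are not the platform's actual API; the point is only that a stream definition acts as a schema, and every event in a stream must match it.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class StreamDefinition:
    """The schema of a stream: attribute names mapped to expected types."""
    name: str
    version: str
    attributes: Dict[str, type]

    def validate(self, payload: Dict[str, Any]) -> None:
        missing = set(self.attributes) - set(payload)
        extra = set(payload) - set(self.attributes)
        if missing or extra:
            raise ValueError(f"missing={sorted(missing)} unexpected={sorted(extra)}")
        for key, expected in self.attributes.items():
            if not isinstance(payload[key], expected):
                raise TypeError(f"{key} should be {expected.__name__}")


@dataclass
class Event:
    """A unit of data whose attributes conform to a stream definition."""
    definition: StreamDefinition
    payload: Dict[str, Any]

    def __post_init__(self) -> None:
        self.definition.validate(self.payload)


class EventStream:
    """A sequence of events that all share one definition."""

    def __init__(self, definition: StreamDefinition) -> None:
        self.definition = definition
        self.events = []

    def append(self, payload: Dict[str, Any]) -> Event:
        event = Event(self.definition, payload)
        self.events.append(event)
        return event


# Hypothetical stream used only for illustration.
logins = EventStream(StreamDefinition("login_attempts", "1.0.0", {"user": str, "success": bool}))
logins.append({"user": "alice", "success": True})
```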

Data Collection

The WSO2 Analytics Platform offers a single API for collecting data. Through event receivers, it can accept data from almost any event source: the inbuilt agents in all WSO2 products, Java agents (Thrift, Kafka, JMS), JavaScript clients (WebSockets, REST), IoT devices (MQTT), and over 100 WSO2 ESB connectors. You can even write a custom agent to collect data from your own system and push it to the analytics platform. In short, events can be received via multiple transports in JSON, XML, Map, Text, and WSO2Event formats, and the platform converts them into streams of canonical WSO2Events to be processed by the server.
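
The idea of converting several input formats into one canonical event can be sketched in a few lines of Python. This is only an illustration of the concept: the function names, the `RECEIVERS` table, and the shape of the canonical dictionary are my own assumptions and do not reflect the real WSO2Event structure or the receiver configuration.

```python
import json
import xml.etree.ElementTree as ET


def from_json(text):
    """Parse a flat JSON object into an attribute map."""
    return dict(json.loads(text))


def from_xml(text):
    """Parse <event><name>value</name>...</event>; all values stay strings."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


def from_map(mapping):
    """A map is already an attribute map; copy it defensively."""
    return dict(mapping)


# One parser per supported input format (hypothetical registry).
RECEIVERS = {"json": from_json, "xml": from_xml, "map": from_map}


def to_canonical(fmt, raw, stream_id):
    """Normalize any supported input into one canonical event shape."""
    if fmt not in RECEIVERS:
        raise ValueError(f"unsupported format: {fmt}")
    return {"stream_id": stream_id, "payload": RECEIVERS[fmt](raw)}
```

Once every input has been normalized this way, the analysis code only has to understand a single event shape, regardless of which transport or format the data arrived in.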

Data Analysis

Once the data is collected, it must be analyzed using one or more of the following techniques:

Batch Analytics 
Real-time Analytics
Interactive Analytics
Predictive Analytics

The WSO2 Analytics Platform comprises three individual products:

WSO2 Data Analytics Server – can perform batch, real-time and interactive analytics
WSO2 Complex Event Processor – used only for real-time analytics
WSO2 Machine Learner – used only for predictive analytics

In this blog post, I will not cover the technical details of how each of these techniques acts on the collected data. Please check links [1] and [2] for more details on the four techniques that the WSO2 Analytics Platform uses to analyze data.

Data Publishing

Events produced by data analysis (or even events that have not been analyzed) are published by event publishers over various transport protocols, including but not limited to SMS, Email, HTTP, JMS, Kafka, MQTT, RDBMS, Logger, and WebSockets. You can write extensions to support other transports as well. Data can be pushed to dashboards through WebSockets/REST, or services/APIs can be invoked. The data can also be pushed to the ESB, which in turn pushes it to legacy systems or even cloud applications. Moreover, the data can be stored in another database or published to other systems that have implemented custom WSO2 data receivers.
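
Conceptually, publishing is a fan-out: one event goes to every transport that is interested in it, and new transports can be plugged in. The Python sketch below shows that shape under my own assumptions. `PublisherRegistry` and the transport names are hypothetical, and the "publishers" just append to lists instead of talking to a real mail server or logger.

```python
from typing import Callable, Dict, Iterable

Publisher = Callable[[dict], None]


class PublisherRegistry:
    """Maps transport names to publisher callables and fans events out."""

    def __init__(self) -> None:
        self._publishers: Dict[str, Publisher] = {}

    def register(self, transport: str, publisher: Publisher) -> None:
        """Add (or replace) a publisher; this is the extension point."""
        self._publishers[transport] = publisher

    def publish(self, event: dict, transports: Iterable[str]) -> None:
        """Send one event to each named transport, failing fast on unknown ones."""
        targets = list(transports)
        unknown = [name for name in targets if name not in self._publishers]
        if unknown:
            raise KeyError(f"no publisher registered for {unknown}")
        for name in targets:
            self._publishers[name](event)


# Stand-in publishers that only record what they were asked to send.
sent_email, sent_log = [], []
registry = PublisherRegistry()
registry.register("email", sent_email.append)
registry.register("logger", sent_log.append)
registry.publish({"alert": "error burst on web-1"}, ["email", "logger"])
```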

This blog post was merely an overview of what the WSO2 Analytics Platform offers and how it operates. The post is based on the content of a webinar [2] I did a few weeks back.  Please check out the full webinar for detailed information about the WSO2 Analytics Platform and the various applications of the different analytics styles. 

Thursday, March 10, 2016

WSO2 API Manager - Basic Functionality Flow Diagram

For more information check out the WSO2 API Manager here

WSO2 Governance Registry 5.x.x FAQs

The WSO2 Governance Registry (G-Reg) has gone through some major transformations, starting from G-Reg 5.0.0 (the current version as of this writing is 5.1.0). In addition to the new and enhanced registry and repository features, G-Reg now comes with multiple views for different roles, i.e. publishers, consumers/subscribers and administrators. This is a significant change from the previous versions which just included one view for all the users. Before understanding the nuts and bolts of G-Reg let's first understand what a registry is and its purpose. What does it do? Why use one?

If your business is SOA-enabled, you need to keep track of your services and who consumes them. Furthermore, as businesses undergo change, including mergers and acquisitions, the number of platforms, consumers, services, and exposed APIs can increase rapidly. SOA governance is needed to provide full visibility into existing assets; without it, businesses lack the tools to govern and manage assets consistently. It's all about ensuring and validating that assets and artifacts within the architecture behave as expected and maintain a certain level of quality. This is where the registry comes into the picture to facilitate SOA governance. The registry can act as a central database that holds artifacts for all services, whether planned for development, in use, or retired. Essentially, it's a catalog of services that is searchable by service consumers and providers. The WSO2 Governance Registry is more than just a SOA registry because, in addition to providing end-to-end SOA governance, it can also store and manage any kind of enterprise asset, including but not limited to services, APIs, policies, projects, applications, and people.

Now that we understand what a registry is, here is a compilation of some FAQs and answers related to WSO2 G-Reg to understand what it offers and how it behaves. 

What is the WSO2 Governance Registry?

The WSO2 Governance Registry (G-Reg) is a SOA-integrated registry-repository for storing and managing data or metadata related to service artifacts and other artifacts. It provides a rich set of features including SOA governance, lifecycle management, and a strong framework for governing anything. For more information on the features and functionality of WSO2 Governance Registry, go to WSO2 Governance Registry.

WSO2 Governance Registry’s main functionality falls under the following two categories.
Content repository
Governance framework

WSO2 Governance Registry provides three main web based user interfaces to facilitate the features and functionality as follows. 

G-Reg Publisher - an end-user, collaborative web interface for governance artifact providers to publish artifacts, manage them, show their dependencies, and gather feedback on their quality and usage.

G-Reg Publisher

G-Reg Store - an end-user, collaborative Web interface for consumers to self-register, discover governance artifact functionality, subscribe to artifacts, evaluate them and interact with artifact publishers.

G-Reg Store

G-Reg Management Console - a Web interface for administrators to perform admin tasks. 

Management Console

How does a service consumer use the solution to find a service and implement a service client?

Service consumers can use the G-Reg Store to self-register, discover and search for SOAP/REST services. G-Reg offers configuration options such as tags, categories, comments, properties, ratings and descriptions for a resource. It is important to plan the use of these options so that services are easy to discover and SOA governance is applied correctly. Making services easy to discover helps tremendously with service reuse. In fact, it's one of the major functions of a registry-repository product. G-Reg provides enhanced search capabilities to facilitate search based on tags and other advanced criteria.

How does a service provider register a new service?

G-Reg allows service providers to register services through the G-Reg Publisher. Users can choose either to enter service details manually or to import service information using a WSDL/WADL URL. The G-Reg Publisher enables artifact providers/creators to publish artifacts, manage them, show their dependencies, and gather feedback on their quality and usage.

How does a service provider use the solution when making a change to a service specification or endpoint?

A service provider can use the G-Reg Publisher to make changes such as editing or versioning an existing service. G-Reg provides tools for asset comparison, dependency management and visualizing service descriptions. It also supports WS-Eventing-based subscriptions and notifications that can be used to govern changes made to individual resources as well as to the lifecycles to which they belong.

How does the solution facilitate service governance?

Service reuse is the heart of SOA. Before implementing a new service, a service provider can search the registry for existing implementations. This helps the provider either to use an existing service as it is or to develop a new service that builds on the existing one. Furthermore, registry-repositories help discover associations among services, which gives a better idea of the impact of changing a particular service. Services in a registry also move through the lifecycle states of create, test, deploy and deprecate.

G-Reg can perform the following functions:

  • Enforce policies during transitions of the states of create, test, deploy and deprecate.
  • Define "who can access what?" of services. Access to certain services may differ depending on the user, user group or state of the service lifecycle.
  • Send notifications to relevant users once a change to a service artifact has been made.
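
The three functions above can be pictured as a small state machine. The Python sketch below is a simplified model under my own assumptions, not how G-Reg lifecycles are actually configured: the `ServiceLifecycle` class, the role names, and the policy checks are all hypothetical. Only the four state names come from the text above.

```python
STATES = ["create", "test", "deploy", "deprecate"]


class ServiceLifecycle:
    """Linear lifecycle with policy checks, role checks and notifications."""

    def __init__(self, policies=None, allowed_roles=None):
        self.state = STATES[0]
        # target state -> list of (policy name, check function) pairs
        self.policies = policies or {}
        # target state -> set of roles allowed to move a service into it
        self.allowed_roles = allowed_roles or {}
        self.notifications = []

    def promote(self, service, role):
        """Move the service to the next state if access and policies allow it."""
        index = STATES.index(self.state)
        if index == len(STATES) - 1:
            raise ValueError("service is already in its final state")
        target = STATES[index + 1]

        allowed = self.allowed_roles.get(target)
        if allowed is not None and role not in allowed:
            raise PermissionError(f"{role} may not move services to {target}")

        for name, check in self.policies.get(target, []):
            if not check(service):
                raise ValueError(f"policy failed: {name}")

        self.state = target
        self.notifications.append(f"{service['name']} moved to {target}")
        return target
```

A failed check leaves the state unchanged, so a service cannot slip into the next state without meeting the policies attached to it.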

As more and more services are introduced and reused, it is necessary to keep track of dependencies of each service in an organization. G-Reg makes life easier by keeping inter-service dependency information as relationships among service information artifacts. For example, such relationships can be Contains, Implements, Uses, Depends, etc.
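
To see why recording these relationships is useful, consider impact analysis: which services might be affected if one service changes? Here is a minimal Python sketch, assuming relationships are stored as (source, relation, target) triples in which the source relies on the target. The service names and the `impacted_services` function are made up for illustration and are not part of G-Reg.

```python
from collections import defaultdict, deque


def impacted_services(relations, changed):
    """Return every service that directly or indirectly relies on `changed`."""
    # Invert the triples: for each target, who relies on it?
    dependents = defaultdict(set)
    for source, _relation, target in relations:
        dependents[target].add(source)

    impacted = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for service in dependents[current]:
            if service != changed and service not in impacted:
                impacted.add(service)
                queue.append(service)
    return impacted


# Hypothetical relationships between four services.
relations = [
    ("checkout", "Uses", "payments"),
    ("payments", "Depends", "ledger"),
    ("reports", "Uses", "ledger"),
]
```

With these triples, a change to `ledger` reaches `payments` and `reports` directly and `checkout` indirectly, which is exactly the kind of question a registry's dependency information lets you answer before making the change.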

Service artifacts evolve over time, for example to fulfill new requirements, and this results in different versions of the same service. G-Reg provides versioning capabilities that can enable automatic version control of stored artifacts. Additionally, G-Reg keeps older versions of artifacts to allow users to migrate smoothly from one version to another.

In summary, G-Reg provides the following capabilities to facilitate SOA Governance:

  • Record information on services
  • Add service/API information manually or import WSDLs/WADLs
  • Discover services using scheduled tasks and discovery agents
  • Search for an existing service for reuse
  • Search using tags/categories. Supported via SOLR.
  • Discover associations and dependencies of a service
  • Service lifecycle management
  • Lifecycle-based asset management
  • In-built and custom lifecycle executors
  • User access control
  • Automatic version control
  • Notification support (email, UI etc.)
  • An SDK for registry-repository extensibility

How is the solution used at design time vs. runtime?

G-Reg can be used at design time to record service information and govern the service lifecycle.
If needed, lifecycle executors can deploy or undeploy services on the relevant servers based on lifecycle state transitions. For example, a custom lifecycle executor can trigger a Jenkins job during a state transition: the executor invokes a remote API of Jenkins which, for example, builds and deploys the service(s) in the production environment when they are promoted from testing to production.
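
The executor idea boils down to a hook that runs on a specific transition and decides whether the transition may go ahead. The Python sketch below models only that idea. It is not the G-Reg executor interface and it does not call Jenkins: `DeployOnPromotionExecutor` is a hypothetical class, and `trigger_build` is a placeholder callable you would replace with real code that invokes your build server.

```python
class DeployOnPromotionExecutor:
    """Runs a deployment hook when a service moves between two given states."""

    def __init__(self, trigger_build, from_state="testing", to_state="production"):
        # trigger_build(service_name) -> truthy on success (assumed contract)
        self.trigger_build = trigger_build
        self.from_state = from_state
        self.to_state = to_state

    def execute(self, service_name, current_state, target_state):
        """Return True if the state transition is allowed to complete."""
        if (current_state, target_state) != (self.from_state, self.to_state):
            return True  # not the transition this executor cares about
        return bool(self.trigger_build(service_name))


# Stand-in for a remote build trigger; it only records the request.
requested_builds = []


def fake_trigger(service_name):
    requested_builds.append(service_name)
    return True


executor = DeployOnPromotionExecutor(fake_trigger)
```

Returning a boolean lets the lifecycle block the promotion if the deployment could not be started, which keeps the recorded state and the actual environment in step.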

Run-time policy enforcement can be done when associating a WS-Policy with a SOAP service. G-Reg can apply these policies using Handlers (Handlers provide the basis for extending the WSO2 Governance Registry functionality). This is an extension feature as G-Reg only creates an association out-of-the-box.

How does the solution manage multiple endpoints for a service (e.g. Dev, QA, Production)?

Endpoints can be added manually to a service via G-Reg Publisher.  When importing WSDLs, only one endpoint will be added; however, more endpoints (QA, Prod) can be added manually.

How does the solution manage multiple versions of a service?

G-Reg provides support to version existing services, view all versions of a service and restore a previous version. It is possible to compare different versions of governance artifacts via the Publisher, provided that the comparison is between versions of the same artifact type. If required, resources can be automatically versioned when they are added or updated, but this feature is disabled by default.
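
Here is a toy Python model of those three operations: versioning, comparing and restoring. It is a sketch under my own assumptions. The `VersionedArtifact` class, the 1-based version numbers, and the choice to record a restore as a new version are illustrative only and do not describe how G-Reg stores versions internally.

```python
class VersionedArtifact:
    """Keeps every version of an artifact's attributes."""

    def __init__(self, artifact_type, content):
        self.artifact_type = artifact_type
        self.versions = [dict(content)]  # version 1 lives at index 0

    @property
    def current(self):
        return self.versions[-1]

    def update(self, **changes):
        """Store a new version with the changes applied; return its number."""
        self.versions.append({**self.current, **changes})
        return len(self.versions)

    def compare(self, older, newer):
        """Return {attribute: (old value, new value)} for attributes that differ."""
        a, b = self.versions[older - 1], self.versions[newer - 1]
        keys = sorted(set(a) | set(b))
        return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

    def restore(self, version):
        """Make an old version current again by copying it as a new version."""
        self.versions.append(dict(self.versions[version - 1]))
        return len(self.versions)


# Hypothetical service artifact with made-up endpoint URLs.
orders = VersionedArtifact(
    "service", {"endpoint": "http://dev.example/orders", "owner": "team-a"}
)
orders.update(endpoint="http://qa.example/orders")
```

Because a restore is stored as a new version rather than deleting later ones, the full history stays available for comparison.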

How does the solution manage service deprecation and discontinuation?

Currently, G-Reg only changes the state of the service to Deprecated in the default service lifecycle, and it can be configured to notify subscribers of that service. However, if any actions need to take place as a result of the lifecycle state being changed to Deprecated, a lifecycle executor can be configured to implement such tasks.