Whispers & Screams
And Other Things

The Web By Proxy

I've been working on networks for decades and, for as long as I can remember, network proxies have existed. I first came across the idea when I worked for IBM as an SNA programmer back in the late 90s, but it is in more recent years that network proxies have taken on greater importance.


Know the way, Go the way, Show the way



Over my many years in business, whether the business of the military or the business of commerce, one of the core threads of weakness in all but the best managers and leaders I have worked with has been an inability, or perhaps an unwillingness, to communicate. All too often I have witnessed poor management communication not only down through the command structure but also, quite frequently, within what would be considered the first tier of communications: their direct reports.

Many such businesses have, it seemed, succeeded, or perhaps survived, in spite of rather than because of these individuals, for whom communication should be the centrepiece of their toolbox. Usually in these situations the intentions are top drawer but the reality is bargain basement. Individuals in such positions of authority, resting on their past achievements or reasonably content with the status quo, pull up the drawbridge to their rarefied level, perhaps feeling that they should maintain an authoritative distance or refrain from fraternising with the ranks. Ridiculous as such a stance may sound on paper, it is all too often manifest in management positions at all levels of business, with consequences for the organisation far more serious than any ridicule may reflect.

Directionless authority figures who fail to capitalise on the talent within their organisations because of their inability to communicate beyond their own lieutenants can lay waste to layer upon layer of that which makes an organisation truly prosper: its people. This is especially true in the world of the startup, where those in authority, and indeed in control, have the greatest of vested interests in seeing the business boom.

As managers, and most especially as managers within small businesses for whom hierarchical structures are not the best fit, communication is what ensures that our own value systems are properly superimposed on the wider team around us. We need to accept our weaknesses. Work on them. Learn by placing ourselves in the uncomfortable situations we could easily avoid. The best way to measure this and truly understand it is to get down and dirty every day. Do sweat the small stuff. Truly understand the small stuff, because when we get the small stuff right and we can communicate all the way down and listen all the way up effectively, we will find ourselves at the centre of a team that really will begin to reflect the hopes and dreams we all have for our own organisations.


Cisco OpenSOC

A couple of days ago Cisco, it would seem, finally released its new open source security analytics framework, OpenSOC, to the developer community. OpenSOC sits conceptually at the intersection between Big Data and Security Analytics.

The current totalizer on the Breach Level Index website (breachlevelindex.com) sits at almost 2.4 billion data records lost this year so far, which works out at approximately 6 million per day. The levels of this data loss will not be dropping anytime soon, as attackers are only going to get better at getting their hands on this information. There is hope, however, as even the best hackers leave clues in their wake, although finding these clues in enormous amounts of analytical data such as logs and telemetry can be the biggest of challenges.

This is where OpenSOC will seek to make the crucial difference and bridge the gap. Incorporating a platform of anomaly detection and incident forensics, it integrates elements of the Hadoop environment such as Kafka, Elasticsearch and Storm to deliver a scalable platform enabling full-packet capture indexing, storage, data enrichment, stream processing, batch processing, real-time search and telemetry aggregation. It will seek to provide security professionals with the facility to detect and react to complex threats on a single converged platform.

The OpenSOC framework provides three key elements for security analytics:


    1. Context

      An extremely high speed mechanism to capture and store security data. OpenSOC consumes data by delivering it to multiple high speed processors capable of heavy lift contextual analytics in tandem with appropriate storage enabling subsequent forensic investigations.

    2. Real-time Processing

      Application of enrichments such as threat intelligence, geolocation, and DNS information to collected telemetry providing for quick reaction investigations.

    3. Centralized Perspective

      The interface presents alert summaries with threat intelligence and enrichment data specific to an alert on a single page. The advanced search capabilities and full packet-extraction tools are available for investigation without the need to pivot between multiple tools.
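To make the idea of enrichment a little more concrete, here is a minimal sketch in plain Python. It does not use any OpenSOC API: the lookup tables, the `enrich` function, the field names and the sample values are all hypothetical, chosen only to illustrate attaching threat intelligence and geolocation to a telemetry record as it arrives.

```python
# Hypothetical lookup tables standing in for real threat-intelligence
# and geolocation feeds. Addresses come from documentation ranges and
# the labels are made up for illustration.
THREAT_INTEL = {"203.0.113.7": "known-botnet"}
GEO = {"203.0.113.7": "NL", "198.51.100.4": "US"}


def enrich(event, threat_intel=THREAT_INTEL, geo=GEO):
    """Return a copy of the event with threat and country fields added."""
    src = event["src_ip"]
    enriched = dict(event)                       # leave the original record untouched
    enriched["threat"] = threat_intel.get(src)   # None when nothing is known
    enriched["country"] = geo.get(src, "unknown")
    return enriched


alert = enrich({"src_ip": "203.0.113.7", "bytes": 512})
print(alert)
```

The point of doing this at ingest time, rather than at query time, is that an analyst looking at an alert already has the surrounding context on the same record.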



When sensitive data is compromised, the company’s reputation, resources, and intellectual property are put at risk. Quickly identifying and resolving the issue is critical, but traditional approaches to security incident investigation can be time-consuming. An analyst may need to take the following steps:

    1. Review reports from a Security Incident and Event Manager (SIEM) and run batch queries on other telemetry sources for additional context.

    2. Research external threat intelligence sources to uncover proactive warnings of potential attacks.

    3. Research a network forensics tool with full packet capture and historical records in order to determine context.



Apart from having to access several tools and information sets, the act of searching and analyzing the amount of data collected can take minutes to hours using traditional techniques. With OpenSOC, security professionals can instead use a single tool to navigate data with narrowed focus, rather than wasting precious time trying to make sense of mountains of unstructured data.


Too Much Information - Hadoop and Big Data

Hadoop, a free, Java-based programming framework that makes it possible to run applications on systems with thousands of nodes involving thousands of terabytes, supports the processing of large amounts of data in a distributed computing environment and is part of the Apache project sponsored by the Apache Software Foundation. Its distributed file system facilitates rapid data transfer rates among nodes and allows the system to continue operating uninterrupted in case of a node failure. This approach lowers the risk of catastrophic system failure, even if a significant number of nodes become inoperative.

Hadoop was inspired by Google's MapReduce, a software framework in which an application is broken down into numerous small parts. Any of these parts (also called fragments or blocks) can be run on any node in the cluster. Doug Cutting, Hadoop's creator, named the framework after his child's stuffed toy elephant. The current Apache Hadoop ecosystem consists of the Hadoop kernel, MapReduce, the Hadoop distributed file system (HDFS) and a number of related projects such as Apache Hive, HBase and Zookeeper.
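The MapReduce idea described above can be sketched in a few lines of plain Python. This is not Hadoop code and uses no Hadoop API: the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample blocks are assumptions made for illustration. Each block could be mapped on a different node, and only the small intermediate pairs need to be brought together.

```python
from collections import defaultdict


def map_phase(block):
    """Turn one block of text into (word, 1) pairs. Runs independently per block."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Group intermediate values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


# Two blocks standing in for pieces of a large file spread across a cluster.
blocks = ["big data big questions", "big clusters process data"]
mapped = [pair for block in blocks for pair in map_phase(block)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

Because `map_phase` looks only at its own block, the work can be handed to whichever node already holds that block.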

The Hadoop framework is used by major players including Google, Yahoo and IBM, largely for applications involving search engines and advertising. The preferred operating systems are Windows and Linux but Hadoop can also work with BSD and OS X.

The rapid proliferation of unstructured data is one of the driving forces of the new paradigm of big data analytics. According to one study, we are now producing as much data every 10 minutes as was created from the beginning of recorded time through the year 2003. The preponderance of data being created is of the unstructured variety, up to about 90% according to IDC.

Big data is about being able not just to capture a wide variety of unstructured data, but also to combine that data with other data to gain new insights that can be used in many ways to improve business performance. For instance, in retail, it could mean delivering faster and better services to customers; in research, it could mean conducting tests over much wider sampling sizes; in healthcare, it could mean faster and more accurate diagnoses of illnesses.

The ways in which big data will change our lives are significant, and are just beginning to reveal themselves to those who are willing to capture, combine, and discover answers to their Big Questions. For big data to deliver on the promise of its vast potential, however, technology must be in place to enable organizations to capture and store massive amounts of unstructured data in its native format. That’s where Hadoop comes in: it has become one of the enabling data processing technologies for big data analytics. Hadoop allows dramatically bigger business questions to be answered, something we are already starting to see realized by large public cloud companies and which will shortly spread into other IT-oriented industries and services.

More than 50% of participating companies have begun implementing the available Hadoop frameworks as data hubs or auxiliary data repositories to their existing infrastructures, according to Intel’s 2013 IT Manager’s Survey on How Organizations are Using Big Data. In addition, a further 31% of organizations reported evaluating an open-source Apache Hadoop framework.

So what are the key characteristics IT professionals should know about Hadoop in order to maximize its potential in managing unstructured data and advancing the cause of big data analytics? Here are five to keep in mind:

    1. Hadoop is economical. As an open-source software framework, Hadoop runs on standard servers. Hardware can be added or swapped in or out of a cluster, and operational costs are relatively low because the software is common across the infrastructure, requiring little tuning for each physical server.

    2. Hadoop provides an efficient framework for processing large sets of data. MapReduce is the software programming framework in the Hadoop stack. Simply put, rather than moving data across a network to be processed, MapReduce provides a framework to move the processing software to the data. In addition to simplifying the processing of big data sets, MapReduce also provides programmers with a common method of defining and orchestrating complex processing tasks across clusters of computers.

    3. Hadoop supports your existing database and analytics infrastructures, and does not displace them. Hadoop can handle data sets and tasks that can be a problem for legacy databases. In big data environments, you want to make sure that the underlying storage and infrastructure platform for the database is capable of handling the capacity and speed of big data initiatives, particularly for mission-critical applications. Because of this capacity it can be, and has been, implemented as a replacement for existing infrastructures, but only where it fits the business need or advantage.

    4. Hadoop will provide the best value where it is implemented with the right infrastructure. The Hadoop framework typically runs on mainstream standard servers using common Intel® server hardware. Newer servers with the latest Intel® computing, larger memory footprint, and more cache will typically provide better performance. In addition, Hadoop will perform better with faster in-node storage, so systems should contain some amount of solid-state storage. The storage infrastructure should also be optimized with the latest advances in automated tiering, deduplication, compression, encryption, erasure coding and thin provisioning. When Hadoop has scaled to encompass larger datasets it benefits from faster networks, so 10Gb Ethernet rather than typical 1GbE bandwidth provides further benefit.

    5. Hadoop is supported by a large and active ecosystem. Big data is a big opportunity, not just for those using it to deliver competitive advantage, but also for those providing solutions. A large and active ecosystem has developed quickly around Hadoop, as it usually does around open-source solutions. As an example, Intel recently invested $740 million into the leading distribution for Hadoop provided by Cloudera. Vendors are available to provide all or part of the Hadoop stack, including management software, third-party applications and a wide range of other tools to help simplify the deployment of Hadoop.



Unstructured data is growing nonstop across a variety of applications, in a wide range of formats. Those companies that are best able to harness it and use it for competitive advantage are seeing significant results and benefits. That’s why more than 80% of the companies surveyed by Intel are using, implementing or evaluating Hadoop.


Could ants power Web3.0 to new heights? OSPF vs ANTS

Having recently completed my latest M.Eng block on the subject of "Natural and Artificial Intelligence", I became aware of advances made in the past decade towards a new paradigm of network traffic engineering. This new model turns its back on traditional destination-based solutions (OSPF, EIGRP, MPLS) to the combinatorial problem of decision making in network routing, favouring instead a constructive greedy heuristic which uses stochastic combinatorial optimisation. Put in more accessible terms, it leverages the emergent ability of systems comprised of quite basic autonomous elements, working together, to perform a variety of complicated tasks with great reliability and consistency.

In 1986, the computer scientist Craig Reynolds set out to investigate this phenomenon through computer simulation. The mystery and beauty of a flock or swarm is perhaps best described in the opening words of his classic 1987 paper on the subject:

The motion of a flock of birds is one of nature’s delights. Flocks and related synchronized group behaviors such as schools of fish or herds of land animals are both beautiful to watch and intriguing to contemplate. A flock ... is made up of discrete birds yet overall motion seems fluid; it is simple in concept yet is so visually complex, it seems randomly arrayed and yet is magnificently synchronized. Perhaps most puzzling is the strong impression of intentional, centralized control. Yet all evidence indicates that flock motion must be merely the aggregate result of the actions of individual animals, each acting solely on the basis of its own local perception of the world.

An analogy with the way ant colonies function has suggested that the emergent ability of ant colonies to reliably and consistently optimise paths could be leveraged to enhance the way that the combinatorial optimisation problem of complex network path selection is solved.

The fundamental difference between the modelling of a complex telecommunications network and more commonplace problems of combinatorial optimisation, such as the travelling salesman problem (TSP), is the dynamic nature of the state of a network such as the internet at any given moment. For example, in the TSP the towns, the routes between them and the associated distances don’t change. Network routing, however, is a dynamic problem. It is dynamic in space, because the shape of the network (its topology) may change: switches and nodes may break down and new ones may come on line. But the problem is also dynamic in time, and quite unpredictably so. The amount of network traffic will vary constantly: some switches may become overloaded, there may be local bursts of activity that make parts of the network very slow, and so on. So network routing is a very difficult problem of dynamic optimisation. Finding fast, efficient and intelligent routing algorithms is a major headache for telecommunications engineers.

So how, you may ask, could ants help here? Individual ants are behaviourally very unsophisticated insects. They have a very limited memory and exhibit individual behaviour that appears to have a large random component. Acting as a collective, however, ants manage to perform a variety of complicated tasks with great reliability and consistency, for example finding the shortest routes from their nest to a food source.

These behaviours emerge from the interactions between large numbers of individual ants and their environment. In many cases, the principle of stigmergy is used. Stigmergy is a form of indirect communication through the environment. Like other insects, ants typically produce specific actions in response to specific local environmental stimuli, rather than as part of the execution of some central plan. If an ant's action changes the local environment in a way that affects one of these specific stimuli, this will influence the subsequent actions of ants at that location. The environmental change may take either of two distinct forms. In the first, the physical characteristics may be changed as a result of carrying out some task-related action, such as digging a hole, or adding a ball of mud to a growing structure. The subsequent perception of the changed environment may cause the next ant to enlarge the hole, or deposit its ball of mud on top of the previous ball. In this type of stigmergy, the cumulative effects of these local task-related changes can guide the growth of a complex structure. This type of influence has been called sematectonic. In the second form, the environment is changed by depositing something which makes no direct contribution to the task, but is used solely to influence subsequent behaviour which is task related. This sign-based stigmergy has been highly developed by ants and other exclusively social insects, which use a variety of highly specific volatile hormones, or pheromones, to provide a sophisticated signalling system. It is primarily this second mechanism of sign-based stigmergy that has been successfully simulated with computer models and applied as a model to a system of network traffic engineering.

In the traditional network model, packets move around the network completely deterministically. A packet arriving at a given node is routed by the device which simply consults the routing table and takes the optimum path based on its destination. There is no element of probability as the values in the routing table represent not probabilities, but the relative desirability of moving to other nodes.
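As a rough illustration of that deterministic lookup, here is a small Python sketch. The table layout, the node names and the `next_hop` function are assumptions made for this example and do not reflect how any particular router stores its routes. Each destination maps to a desirability score per neighbour, and the best score always wins.

```python
def next_hop(routing_table, destination):
    """Return the neighbour with the highest desirability for this destination."""
    scores = routing_table[destination]
    return max(scores, key=scores.get)


# Hypothetical routing table held at node "A": destination -> {neighbour: desirability}.
table_at_a = {
    "D": {"B": 0.9, "C": 0.4},
    "E": {"B": 0.2, "C": 0.7},
}

print(next_hop(table_at_a, "D"))
print(next_hop(table_at_a, "E"))
```

The same packet arriving at the same node with the same table will always leave by the same link; nothing is left to chance.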

In the ant colony optimisation model, virtual ants also move around the network, their task being to constantly adjust the routing tables according to the latest information about network conditions. For an ant, the values in the table are probabilities that its next move will be to a certain node. The progress of an ant around the network is governed by the following informal rules:

    • Ants start at random nodes.

    • They move around the network from node to node, using the routing table at each node as a guide to which link to cross next.

    • As it explores, an ant ages, the age of each individual being related to the length of time elapsed since it set out from its source. However, an ant that finds itself at a congested node is delayed, and thus made to age faster than ants moving through less choked areas.

    • As an ant crosses a link between two nodes, it deposits pheromone; however, it leaves it not on the link itself, but on the entry for that link in the routing table of the node it left. Other 'pheromone' values in that column of the node's routing table are decreased, in a process analogous to pheromone decay.

    • When an ant reaches its final destination it is presumed to have died and is deleted from the system. R.I.P.
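The rules above can be sketched in Python along the following lines. This is an illustrative reading of the informal rules, not the published algorithm: the function names, the shape of the pheromone formula and its constants (`scale`, `floor`) are assumptions. A `column` here is one column of a node's routing table, mapping each neighbour to the probability of being chosen.

```python
import random


def choose_next_hop(column, rng=random):
    """Pick a neighbour with probability equal to its routing-table entry."""
    neighbours = list(column)
    weights = [column[n] for n in neighbours]
    return rng.choices(neighbours, weights=weights, k=1)[0]


def pheromone_for_age(age, scale=0.08, floor=0.005):
    """Older (slower or delayed) ants deposit less pheromone. Constants are illustrative."""
    return scale / max(age, 1) + floor


def reinforce(column, chosen, delta):
    """Raise the chosen entry and shrink the others so the column still sums to 1."""
    return {
        n: (p + delta) / (1 + delta) if n == chosen else p / (1 + delta)
        for n, p in column.items()
    }


column = {"B": 0.5, "C": 0.5}
hop = choose_next_hop(column, rng=random.Random(0))
updated = reinforce(column, "B", pheromone_for_age(age=2))
print(hop, updated)
```

Dividing every entry by `1 + delta` is what plays the part of pheromone decay: the entries that were not reinforced lose a little ground each time another link is strengthened.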



Tests of the ant colony optimisation system against a number of other well-known routing techniques produced good results, with the system outperforming all of the established mechanisms. There are, however, potential problems of the kind that constantly plague all dynamic optimisation algorithms. The most significant is that, after a long period of stability and equilibrium, the ants will have become locked into their accustomed routes. They become unable to break out of these patterns to explore new routes capable of meeting the new conditions which could exist if a sudden change to the network's conditions were to take place. This can be mitigated, however, by introducing an element of purely random behaviour to the ant, in the same way that evolutionary computation introduces mutation to fully explore new possibilities.

'Ant net' routing has been tested on models of US and Japanese communications networks, using a variety of different possible traffic patterns. The algorithm worked at least as well as, and in some cases much better than, four of the best-performing conventional routing algorithms. Its results were even comparable to those of an idealised ‘daemon’ algorithm, with instantaneous and complete knowledge of the current state of the network.

It would seem we have not heard the last of these routing antics... (sorry, couldn't resist).
