Beyond Robust: Resilient IT (Part 3)

If you have followed my series of posts on how to manage IT systems so that IT benefits from failures, you will find that these approaches are similar to the ideas that Nassim Nicholas Taleb explores in his book, Antifragile: Things That Gain from Disorder. His work applies to IT professionals who want a deeper understanding of how complex, interconnected systems benefit from pressure. I fully acknowledge his ideas and how they relate to the ideas I post here. Using these concepts in IT is part of a journey that began a few years ago in some of the leading IT technology companies; the idea of antifragility (a term Dr. Taleb coined) appears to have evolved naturally in IT, in parallel with his work.

In Part 1 and Part 2, I introduced concepts of robustness and suggested a new IT management philosophy for moving IT beyond robustness to a system that benefits from failure. I presented approaches that evolve the traditional thinking of scripted, top-down testing and change management to create robustness, complemented with the injection of randomness and failure – perhaps best described as introducing managed chaos into the production IT system.

However, the world of cloud introduces new dimensions that must be considered in the quest to build systems that benefit from failure. It is arguably even more important to introduce IT management concepts based on reducing fragility in cloud environments, because of the significant abstraction through virtualization that clouds typically reside on. The non-transparency of underlying IT resources and the separation of control between the provider and the cloud consumer can add fragility, as underlying complexities can cause unplanned outages during state or configuration changes.

Don't solely rely on an SLA to prevent risk

The premise of public cloud offerings that most appeals to IT leaders is the value of scalability, capital efficiency and rapid deployment delivered as a service, often combined with a penalty if the agreed service level agreement (SLA) is not met. Some IT leaders suggest that just having the penalty in place provides enough confidence in the technologies the cloud provider uses. I suggest that IT leaders pay close attention to the architecture so that they know under which circumstances IT services will not be delivered, and what inter-dependencies exist on vendors across the stack: infrastructure, applications, networks, end-user devices, and so on. The assumption I caution IT leaders to avoid is that the environment is protected from risk just because an SLA is in effect. Ensuring transparency to the consumer of cloud services on changes of material interest, combined with some ability to request information on resiliency, is important so that decisions that impact end users are made with transparency and controls.

However, dialogue alone will not uncover omissions in understanding. IT leaders should consider introducing deliberate failures and stress into the environment to verify that it can withstand them. This is particularly important in a provider/consumer relationship like the public cloud, where the IT stack now has multiple spans of control. Methods to introduce stress differ across these spans of control as well.
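As a concrete starting point, here is a minimal sketch of what a consumer-side failure drill might look like within the span of control you own. It is an illustration under stated assumptions, not a production tool: the instance names and the terminate_instance function are hypothetical stand-ins for whatever API your provider or orchestrator actually exposes.

    import logging
    import random

    # Hypothetical inventory of redundant service instances under our control.
    # In a real environment this would come from an orchestrator or CMDB.
    INSTANCES = ["web-01", "web-02", "worker-01", "worker-02", "cache-01"]

    def terminate_instance(name: str) -> None:
        # Placeholder: call the provider's or orchestrator's API here.
        logging.warning("Chaos drill: terminating %s", name)

    def chaos_drill(instances, kill_probability=0.2, seed=None):
        """Randomly terminate a small fraction of instances to verify
        that the surviving system still delivers its service."""
        rng = random.Random(seed)
        for name in instances:
            if rng.random() < kill_probability:
                terminate_instance(name)

    if __name__ == "__main__":
        logging.basicConfig(level=logging.INFO)
        chaos_drill(INSTANCES)

The value is less in the script than in the discipline around it: run drills like this on a schedule, observe what actually breaks, and feed the findings back into architecture reviews and provider conversations.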
It is important to realize that even the most integrated cloud offerings are built from many interconnected processes and technologies, and failures typically happen at the intersection of technologies, processes and humans.

As discussed, public cloud architectures typically span multiple areas of control across multiple providers. Some components in the IT stack are delivered by one or more providers, while other components may be provided by the organization using in-house IT staff. It is not uncommon to find that some back-end applications are delivered by a provider as-a-service while in-house IT is in charge of connecting, implementing and delivering end-point devices such as PCs, tablets and smart phones. Intermixed across this stack are multiple levels of security methods, access point controls, networking services, business continuity strategies, recovery and audit systems, business-to-business communication, and compliance and regulatory requirements that may be specific to each company and its industry. This is further complicated by modern cloud architectures that run applications segmented from each other to gain isolation. These environments use advanced virtualization technologies built on the principle that efficiency and scale are gained by sharing systems between applications while hiding system state within each technology layer, allowing the provider to emphasize flexibility and malleability.

This emphasis on flexibility and malleability through advanced sharing is the very reason why random stress testing and fault injection across intersection points is a great idea for the public cloud! Some of the largest consumers of public clouds have implemented technologies that deliberately cause failures at various intersection points, and many of them have even released their approaches to deliberately stressing the environment as open source for the broader IT community.

The motivation for an IT executive to introduce stress in a public cloud setting with SLA guarantees is that, for most businesses, the SLA penalty is not as valuable to the business as the service that IT systems deliver. Typically, a penalty serves as a motivator for the provider to reduce fragility and risk. The intersection between the services a cloud provider offers and how the users of a business access, interact and connect to IT is a new frontier.

Be aware of all public cloud control points

The key insight in public cloud vs. private cloud is the separation of spans of control and the technologies which have been introduced to abstract underlying resources from the consumer of cloud capabilities. Within these technologies and management constructs, thinking must evolve so that fragility is managed higher up in the IT stack than before. In the past, IT executives could rely on the network routing around problems, but they may no longer have visibility into network state; as a result, they should introduce state changes high up, near the application itself, to assess whether the infrastructure reacts as intended. For some applications this has led to application re-design, as legacy assumptions about how the infrastructure behaves during failure no longer hold true. For instance, some vendors of public cloud advocate that applications should design in failure management capabilities.
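What "designing in failure management" means varies by provider, but a common pattern is for the application itself to retry and fail over across redundant endpoints rather than rely on infrastructure clustering. The sketch below illustrates that pattern; the endpoint URLs and the fetch_json helper are invented for the example and do not reflect any particular provider's API.

    import time
    import urllib.error
    import urllib.request

    # Hypothetical redundant endpoints for the same service, e.g. two zones.
    ENDPOINTS = [
        "https://service-zone-a.example.com/status",
        "https://service-zone-b.example.com/status",
    ]

    def fetch_json(url: str, timeout: float = 2.0) -> bytes:
        # Illustrative helper; a real client would parse and validate the body.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()

    def call_with_failover(endpoints, passes=3, backoff=0.5):
        """Try each endpoint in turn; between full passes, wait with
        exponential backoff so a zone failure degrades the call rather
        than failing it outright."""
        delay = backoff
        for _ in range(passes):
            for url in endpoints:
                try:
                    return fetch_json(url)
                except (urllib.error.URLError, OSError):
                    continue  # try the next endpoint
            time.sleep(delay)
            delay *= 2
        raise RuntimeError("all endpoints failed after %d passes" % passes)

Note the control point: the application, not the infrastructure, owns the recovery logic. That shift in ownership is exactly the change discussed next.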
Embedding state transfer capabilities into the application itself differs from private cloud environments, where many applications had clustering capabilities provided by the infrastructure. The basic idea of the capability remains the same, but the control point changes: in the public cloud, the infrastructure offers a set of controls that the application has to integrate against, whereas in the private environment the infrastructure is typically modeled to suit the application and its requirements. This aspect of public cloud relates not only to risk but also illustrates the challenge of assessing the cost of implementing the capability at a defined risk; the intersection points and dependencies that require changes differ significantly across the different types of public cloud providers.

IT executives should apply the principle of moving IT beyond robustness, evolving IT into a function that benefits from failure by introducing stress systematically, deliberately and with some randomness. This idea applies just as much to the in-house data center as it does to the outsourced IT organization, where there may be even more benefit in ensuring that IT is resilient. The concept is not just relevant to the public cloud; it may be essential there, because limited span of control and visibility, combined with invisible state changes and dependencies in the IT infrastructure and its connection points, can lead to new and unintended interactions that make robust systems fragile.

Modern systems are amazingly agile, scalable, resilient and efficient. However, the legacy thinking of pre-deployment testing combined with change management and scripted, top-down failure testing is no longer sufficient. The implication for technologists and IT leaders is to evolve their thinking on how to move systems beyond static robustness. A bit of chaos, with deliberate failures, is a good way to get started. Good luck in your journey!


Mobile World Congress 2014: What a Difference a Year Makes

I just returned from Barcelona, Spain, where the largest service provider event of the year was held: Mobile World Congress 2014. A year ago, not long after joining EMC, Paul Maritz and I attended MWC2013 based on a view that the service provider and off-premise ecosystem would become increasingly important to the EMC federation of businesses (EMC, VMware, Pivotal and RSA). We met with most of the major telecom companies in the world and, universally, they recognized the value of the EMC focus on driving cloud technology and our new focus on building a platform for big data, analytics and modern applications. As I mentioned last year, we left exhausted but excited by the validation of our EMC focus.

This year the activity level was an order of magnitude higher. EMC, VMware and Pivotal were all present. In addition to Paul Maritz and me, VMware CEO Pat Gelsinger and EMC Chairman and CEO Joe Tucci also attended.

On Wednesday, Joe gave a keynote address that covered the shift to the third platform and how this will result in hundreds of billions of connected devices, the Internet of Things, and millions of new applications. The enabling trends are Cloud, Social, Mobile and Big Data. A key takeaway was that this shift is inevitable and will disrupt industries.

Beyond sharing our view, Joe invited the industry to establish a portable and agile cloud experience by joining the Cloud Foundry community. On Monday, Pivotal announced that Cloud Foundry has moved to a true, open source foundation with initial members including IBM, HP, Rackspace, and SAP. Many global service providers are already engaged, including NTT, Swisscom, CenturyLink, Verizon, and CSC. Joe's message was clear: the industry needs cloud portability, agility, and an open cloud operating system, and Cloud Foundry is the best way to achieve this critical objective.

One of the most tangible examples of the traction we are making in the service provider's future was Joe's discussion of the Real Time Intelligence (RTI) solution Pivotal has created. We had an early prototype of this offering at MWC2013 and were looking for service provider partners to work with in moving it to reality. We ended up with a long list of operators looking to engage, and we began that process. The leading partner for this effort was Vodafone. Fast forward to 2014: we now have the RTI system in Vodafone's live wireless networks and it has been productized.

RTI is a new kind of real-time big data platform for a wide range of environments, including telecom operators. The RTI system is based on the Pivotal One technologies and provides the ability to access and reason over large, diverse data sets, ranging from subscriber databases to billing systems and network information. We then add the Pivotal in-memory real-time analytics to the system to capture and process huge volumes of events coming from the carrier infrastructure. In the Vodafone example, the system ingests over a million events per second collected from the signaling stream of the wireless network. Finally, the RTI system is organized as a platform that allows rapid development of new big data applications.

In fact, in the Vodafone booth, their Spanish operation wanted to show RTI being used to track and model people and traffic flow over Barcelona in real time. Since their RTI platform was already in place, a new application was developed and deployed in only a few weeks. The system worked by tracing the Vodafone ES employees as they moved throughout the city.
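As a rough illustration of the kind of computation such an application performs, here is a minimal sketch of sliding-window event counting per city zone. Everything in it is hypothetical: the event format and zone names are invented, and Pivotal's actual in-memory analytics stack is engineered very differently to sustain a million events per second.

    import time
    from collections import Counter, deque

    class ZoneCounter:
        """Count anonymized device events per zone over a sliding window."""

        def __init__(self, window_seconds=60):
            self.window = window_seconds
            self.events = deque()  # (timestamp, zone) pairs, oldest first

        def ingest(self, timestamp, zone):
            self.events.append((timestamp, zone))
            self._expire(timestamp)

        def _expire(self, now):
            # Drop events that have fallen out of the window.
            while self.events and self.events[0][0] < now - self.window:
                self.events.popleft()

        def snapshot(self):
            return Counter(zone for _, zone in self.events)

    # Example: a few synthetic signaling events near the MWC venue.
    counter = ZoneCounter()
    now = time.time()
    for ts, zone in [(now - 70, "Fira Gran Via"), (now - 10, "Fira Gran Via"),
                     (now - 5, "Hall 3"), (now - 1, "Hall 3")]:
        counter.ingest(ts, zone)
    print(counter.snapshot())  # the 70-second-old event has expired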
In the demo itself, you could see in real time how they moved and where they clustered, visualized against 3D maps of the city. In one very interesting view you could see when anonymized employees showed up at the MWC venue and how many were present at any time in the specific halls and sub-areas of the site. You could also see where they came from, what parts of the city they passed through, and, given a large enough data set, even visualize population flow in real time to better understand congestion and transportation performance.

The key difference between RTI and existing operator solutions is that RTI creates a common way to collect, manage and reason over your big data and real-time data via a modern platform optimized for new application development and diverse data sources. We know of hundreds of use cases that RTI can address (from churn management to NPS score visibility and network optimization) on a single uniform platform, where each use case just means a new application. From concept to product in a little over a year was pretty good progress and exciting to see.

Joe closed out by highlighting the EMC progress in building clouds, driving the evolution of storage to accommodate cloud models, and evolving the mobile technology space. He showed off our mobile file sync and share collaboration technology, Syncplicity. Unlike consumer-oriented offerings, this system not only provides slick, mobile-friendly integration of the user experience but also allows customers to choose where to store the data (in the cloud, in their data center, hybrid, etc.). In addition to Syncplicity, Joe highlighted VMware's announced acquisition of AirWatch, which gives VMware the leading technology in enterprise mobility management and adds to the already broad end-user computing capability within the VMware portfolio. Finally, he told the audience how EMC is helping them manage data at exabyte scale. It took 26 years to ship a total of 1EB of storage. Last year, EMC shipped that much in a month. Now, we are seeing some individual deals that approach an exabyte in size.

What a difference a year makes! We believe that IT and Telecom are coming together and that EMC technical capability is at the core of this new environment. Mobile World Congress 2014 and the significant visibility of the EMC federation certainly reinforce this.


Inside Look At Why Gartner Tapped EMC as a Leader in the General Purpose Disk Array MQ

Written by: Jonathan Siegal, Vice President, Product Marketing, Core Technology Division, EMC, and Suresh Sathyamurthy, Senior Director, Product Marketing, Emerging Technology Division, EMC

Over the past year, EMC has continued to introduce innovations in our hybrid flash array products, not just to make them go faster or be easier to use, but to push the boundaries of what data storage can do for our customers. Sometimes this means starting over with a new plan, taking our products – including EMC VMAX, VNX, and Isilon – in new directions to tackle tomorrow's possibilities. We believe this commitment to innovation, imagination and continuous improvement helped EMC once again earn its position in the top-right of the Leaders quadrant in Gartner's 2015 Magic Quadrant for General-Purpose Disk Arrays.*

EMC has announced a number of product line enhancements over the past year, beginning with the revolutionary new VMAX3, the industry's first enterprise data service platform. VMAX3 isn't just bigger, better and faster than our previous enterprise storage products; it is an innovative new architecture, top to bottom, built to match the velocity of the industry. We designed the system to separate the software data services from the underlying hardware, allowing us to deliver new software functionality much more quickly and in far simpler ways. What does that mean for our customers? It means we can rapidly evolve VMAX3 capabilities to match evolving needs for the next decade.

The VNX family continues to push the boundaries both inside and outside the storage hardware "box." Outside-the-box innovation began earlier this year when we launched vVNX Community Edition, a free, software-only version of VNX storage built for test and development use cases. vVNX Community Edition has been downloaded more than 10,000 times to date. We have also continued to push the boundaries of VNX into entry-level markets, expanding on our popular VNXe3200 array by introducing the VNXe1600, which starts at under $8,000.

Customers in media & entertainment, finance, life sciences, oil & gas and other sectors with critical unstructured data have been turning to Isilon to support their Big Data storage needs for more than a decade. Whether it's media companies storing massive amounts of video or images or life sciences organizations running genomic sequencing workloads, Isilon easily handles users' large-scale data. With this in mind, we launched the Isilon Scale Out Data Lake earlier this year to bring data, applications and analytics together. Isilon is a key part of customers' Data Lake strategies, providing the ability to store and analyze Big Data. And we're not stopping there. Stay tuned as we prepare to share more exciting Isilon news this coming Tuesday, November 10.

With our track record of substantial investment in R&D, you can bet that EMC will continue to innovate, enhance our existing products and develop new ones to meet the needs of our customers, something we're confident will continue to be reflected in future reports such as the Gartner Magic Quadrants.

View the full report from Gartner: 2015 Magic Quadrant for General-Purpose Disk Arrays

*Disclaimer: This Magic Quadrant graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from EMC. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.


10 Things to Know About VxRail and VxRack SDDC

IT transformation is now more attainable than ever for organizations with VMware environments. With the latest announcements for VxRail and VxRack SDDC, Dell EMC advancements in hyper-converged infrastructure (HCI) are making it possible for customers to accelerate modernization and build an end-to-end VMware cloud.

Ready to make the most of this product launch and the growing HCI opportunity? Here are 10 key things to know about the newest releases of VxRail and VxRack SDDC.

1. The #1 HCI portfolio, now even better. Dell EMC VxRail and VxRack SDDC deliver continuous innovation with leading-edge technology that further advances the industry's #1 HCI portfolio.*

2. The only HCI jointly engineered with VMware. VxRail and VxRack SDDC are the only HCI solutions jointly engineered with VMware for seamless integration, automated set-up and upgrades, common management and orchestration, and single-call support.

3. A clear path to an end-to-end VMware cloud. These solutions open a clear and defined path to an end-to-end VMware cloud, and they accelerate a customer's ability to predictably transform to a multi-cloud environment.

4. Faster, simpler modernization. VxRail and VxRack SDDC simplify modernization, improve operational efficiencies, automate deployment, and accelerate time to production.

5. Embracing the latest advancements. Create certainty with next-generation technology that future-proofs customers' IT infrastructure through NVMe cache drives and 25Gbps network connectivity on VxRail.

6. Harnessing the power of the 14th generation. VxRack SDDC is now built on VxRail based on 14th-generation PowerEdge servers, optimized for HCI.

7. Flexible scalability. Customers can start small with as few as 3 nodes with VxRail and expand in single nodes, with the flexibility to mix and match node types, including across generations.

8. The Dell Technologies story. This evolving HCI leadership is part of a bigger story that puts Dell EMC at the epicenter of the world's largest technology company, one that delivers a truly unified yet simple experience for customers solving complex problems.

9. Foundations for a cloud-first world. HCI is a key foundational component of Dell EMC transforming IT for a cloud-first world. Customers are embracing a cloud-first operating model to enable them to quickly implement new ideas, lower complexity and risk, and create transparent and efficient systems.

10. On May 1, it begins. These product advancements for VxRail and VxRack SDDC were announced to customers at Dell Technologies World on May 1, 2018. Get ready to seize the momentum coming out of the event.

Start building credibility and demand with your customers with a complete library of assets for Dell EMC VxRail and VxRack SDDC.

*Source: IDC Worldwide Quarterly Converged Systems Tracker, 3Q17 Vendor Revenue, December 2017


Google says North Korea-backed hackers sought cyber research

SEOUL, South Korea (AP) — Google believes hackers backed by the North Korean government posed as computer security bloggers on social media while attempting to steal information from researchers in the field. Experts say the attacks targeting computer security experts reflect North Korean efforts to improve its cyber skills and be able to breach widely used computer products, such as Google's Chrome browser and Microsoft's Windows 10 operating system. A Google researcher says the hackers created social media profiles to build credibility and interact with researchers, who were then compromised after following a link to a blog set up by the hackers. North Korea has been linked to the 2014 hacking of Sony Pictures and the WannaCry malware attack of 2017.
