High Scalability

Stuff The Internet Says On Scalability For September 30th, 2016

High Scalability - Fri, 2016-09-30 15:56

Hey, it's HighScalability time:

 

Everything is a network. Map showing the global genetic interaction network of a cell. 

 

If you like this sort of Stuff then please support me on Patreon.
  • 18: Google can now drink and drive in Washington DC.; $10 billion: cost of a Vision Quest to Mars; 620 Gbps: DDoS attack on KrebsOnSecurity; 1 Tbps: DDoS attack on OVH; $200,000: cost of a typical cyber incident; 8 million: video training dataset labeled with 4800 labels; 180: Amazon warehouses in the US; 10: bits of info per photon; 16: GPUs in new AI killer P2 instance type;

  • Quotable Quotes:
    • @markmccaughrean: 1,000,000 people to Mars in 100 yrs. 10 people/launch? That's 3 a day, every day, for a century. 1% failure rate? One explosion every month
    • @jeremiahg: Any sufficiently advanced exploit is indistinguishable from a 400lb hacker.
    • BrianKrebs: I suggested to Mr. Wright perhaps a better comparison was that ne’er-do-wells now have a virtually limitless supply of Stormtrooper clones that can be conscripted into an attack at a moment’s notice.
    • Sonia: Academia’s not-so-subtle disdain for applied research does more than damage a few promising careers; it renders our field’s output useless, destined to collect dust on the shelves of Elsevier.
    • Monica L. Smith: Nobody builds their own infrastructure. You don’t build your own highway, train line, water pipe, your own sewer. Those are things that connect you and your household to everybody else sequentially in your neighborhood, in your region, from the city out into the broader hinterlands.
    • @olesovhcom: This botnet with 145607 cameras/dvr (1-30Mbps per IP) is able to send >1.5Tbps DDoS. Type: tcp/ack, tcp/ack+psh, tcp/syn.
    • kenrose: We see this pattern at PagerDuty over the majority of our customers. There is a definite lull in alert volume over the weekends that picks up first thing Monday morning. It's led to my personal conclusion that most production issues are caused by people, not errant hardware or systems.
    • @rseroter: "We Crammed this Monolith Into a Container and Called it a Microservice"
    • @mweagle: I really don’t want to run my own k8s in AWS, but ECS is so opaque to debug that k8s seems like a good choice.
    • Werner Vogels~ We have this overarching goal which is customer centricity. Doing anything that benefits the customer gets priority above everything else. Working on eliminating all single points of failure in the company purely benefits the customer because it really improves the customer experience.
    • Cory Doctorow~ The thing open source software had going for it was the Ulysses Pact...the  irrevocable license, the failure mode of open source software, having founded an open source software company, I can tell you there are moments where it feels like your survival turns on being able to close the code you had opened when you were idealistic. There are moments of desperation when that happens. 
    • @lightbend: "We've been using #Akka in production for over two years, without a single crash." -@CruiseNorwegian |
    • @cloud_opinion: Monolithic -> Microservices -> "which container image?" -> "Screw it, lets do PaaS" ->  CF  or AWS?
    • Etsy: concurrency proved to be great for logical aggregation of components, and not so great for performance optimization. Better database access would be better for that.
    • Yaniv Nizan: the number of users actually contributing ad revenue in your app is a lot lower than 6.5% and much closer to the 1% or 2% that contribute revenue from In-app purchases. 
    • @reckless: Elon is basically putting on an Apple event, for going to Mars.
    • @potch: DRY: Don't Repeat Yourself / DAMP: Do Abstraction/Minimalism Pragmatically / MOIST: Maybe Only Innovate Some Times?
    • @dannysullivan: In the Facebook video metrics thing, spare a thought for the poor BuzzFeed watermelon, less viral than it thought :)
    • Addison Snell: If the promise of cloud computing is overblown, it is because of the amplification it gets from its loyal converts, enterprises who have found liberation and agility in outsourcing IT.
    • @psaffo: In 1990, the size of the US software industry was $3.2 billion -- the same size as the gourmet popcorn industry in that same year.
    • David Rosenthal: [Storage] Revenues are flat or decreasing, profits are decreasing for both companies. These do not look like companies faced by insatiable demand for their products; they look like mature companies facing increasing difficulty in scaling their technology.
    • @legind: Let's Encrypt now the 3rd largest CA, after Comodo and Symantec, comprising over 13% of the SSL cert market share 
    • @stewartbrand: “In the long run, the technology driving activities in space will be biological.” Rousing essay by Freeman Dyson.
    • @jessitron: Constructing causal ordering at the generic level of "all messages received cause all future messages sent" is expensive and also less meaningful than a business-logic-aware, conscious causal ordering. This conscious causal ordering gives us external consistency, accurate legibility, and visibility into what we know to be causal.

  • In an article light on details and written with more of a marketing flourish, we still learn some interesting things about the infrastructure behind Pokemon Go: Bringing Pokémon GO to life on Google Cloud. It runs on Google Cloud, Kubernetes, Google Container Engine, HTTP/S Load Balancer, and Cloud Datastore. Keep in mind that Alphabet is invested in Niantic, and that Ingress, the forerunner of Pokemon Go, ran on App Engine. So it sounds like a new backend implementation that had to scale from zero to the size of Twitter in a matter of weeks, with a much more complicated workload. Growth was explosive. Player traffic was 50x larger than initial estimates. An implication is that the problems experienced during launch were not infrastructure related. Google, in the form of Customer Reliability Engineering (CRE), worked closely with Niantic to make sure the infrastructure scaled. The problems must have been elsewhere in the application stack, which is perfectly understandable. That sort of load could not have been predicted. The design decisions you make for 5x expected traffic are very different from those you make for 50x. Nobody will spend the money or take the time to build a system for 50x. Nobody. Lots of good comments on HackerNews. Good question by ksec: would Pokemon Go even be possible in a pre-cloud era?

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

How Uber Manages a Million Writes Per Second Using Mesos and Cassandra Across Multiple Datacenters

High Scalability - Wed, 2016-09-28 15:59

If you are Uber and you need to store the location data that is sent out every 30 seconds by both driver and rider apps, what do you do? That’s a lot of real-time data that needs to be used in real-time.

Uber’s solution is comprehensive. They built their own system that runs Cassandra on top of Mesos. It’s all explained in a good talk by Abhishek Verma, Software Engineer at Uber: Cassandra on Mesos Across Multiple Datacenters at Uber (slides).

Is this something you should do too? That’s an interesting thought that comes to mind when listening to Abhishek’s talk.

Developers have a lot of difficult choices to make these days. Should we go all in on the cloud? Which one? Isn’t it too expensive? Do we worry about lock-in? Or should we try to have it both ways and craft brew a hybrid architecture? Or should we just do it all ourselves for fear of being cloud shamed by our board for not reaching 50 percent gross margins?

Uber decided to build their own. Or rather they decided to weld together their own system by fusing together two very capable open source components. What was needed was a way to make Cassandra and Mesos work together, and that’s what Uber built.

For Uber the decision is not all that hard. They are very well financed and have access to the top talent and resources needed to create, maintain, and update these kinds of complex systems.

Since Uber’s goal is for transportation to have 99.99% availability for everyone, everywhere, it really makes sense to want to be able to control your costs as you scale to infinity and beyond.

But as you listen to the talk you realize the staggering effort that goes into making these kinds of systems. Is this really something your average shop can do? No, not really. Keep this in mind if you are one of those cloud deniers who want everyone to build all their own code on top of the barest of bare metals.

Trading money for time is often a good deal. Trading money for skill is often absolutely necessary.

Given Uber’s goal of reliability, where out of 10,000 requests only one can fail, they need to run out of multiple datacenters. Since Cassandra is proven to handle huge loads and works across datacenters, it makes sense as the database choice.  
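The "one failure in 10,000 requests" figure is simply the 99.99% target turned into an error budget. A minimal sketch of that arithmetic follows; the function name `error_budget` is made up for illustration and is not from the talk.

```python
from fractions import Fraction

def error_budget(availability_percent: str, requests: int) -> Fraction:
    """Number of requests that may fail while still meeting the target.

    Fractions are used so that 99.99% is represented exactly rather than
    as a rounded floating point value.
    """
    availability = Fraction(availability_percent) / 100
    return requests * (1 - availability)

# 99.99% availability over 10,000 requests leaves room for one failure.
print(error_budget("99.99", 10_000))

# The same target applied to a million requests.
print(error_budget("99.99", 1_000_000))
```

The budget scales linearly with request volume, so at the write rates quoted later in the post even a 99.99% target allows a steady trickle of failed requests.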

And if you want to make transportation reliable for everyone, everywhere, you need to use your resources efficiently. That’s the idea behind using a datacenter OS like Mesos. By statistically multiplexing services on the same machines you need 30% fewer machines, which saves money. Mesos was chosen because at the time Mesos was the only product proven to work with cluster sizes of 10s of thousands of machines, which was an Uber requirement. Uber does things in the large.
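Why does sharing machines save hardware? A toy sketch of the idea, assuming three hypothetical services whose peaks fall at different times. The demand numbers are invented for illustration and are not Uber's; the 30% figure above is theirs and is not reproduced here.

```python
# Hypothetical demand, in machines, for three services over four time slots.
# Each service peaks in a different slot.
demand = {
    "service_a": [10, 4, 2, 3],
    "service_b": [3, 10, 4, 2],
    "service_c": [2, 3, 10, 4],
}

def dedicated_machines(demand):
    """Each service gets its own cluster, sized for its own peak."""
    return sum(max(series) for series in demand.values())

def shared_machines(demand):
    """One shared cluster, sized for the peak of the combined load."""
    combined = [sum(slot) for slot in zip(*demand.values())]
    return max(combined)

dedicated = dedicated_machines(demand)
shared = shared_machines(demand)
savings = 1 - shared / dedicated
print(f"dedicated={dedicated} shared={shared} savings={savings:.0%}")
```

The shared cluster can never need more machines than the dedicated ones, because the peak of a sum is at most the sum of the peaks. How much is actually saved depends entirely on how well the services' peaks are spread out.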

What were some of the more interesting findings?

  • You can run stateful services in containers. Uber found there was hardly any difference, 5-10% overhead, between running Cassandra on bare metal versus running Cassandra in a container managed by Mesos.

  • Performance is good: mean read latency: 13 ms and write latency: 25 ms, and P99s look good.

  • For their largest clusters they are able to support more than a million writes/sec and ~100k reads/sec.

  • Agility is more important than performance. With this kind of architecture what Uber gets is agility. It’s very easy to create and run workloads across clusters.

Here’s my gloss of the talk:

In the Beginning
Categories: High Scalability

Sponsored Post: ScaleArc, Spotify, Aerospike, Scalyr, Gusto, VividCortex, MemSQL, InMemory.Net, Zohocorp

High Scalability - Tue, 2016-09-27 15:56

Who's Hiring?
  • Spotify is looking for individuals passionate about infrastructure to join our Site Reliability Engineering organization. Spotify SREs design, code, and operate tools and systems to reduce the amount of time and effort necessary for our engineers to scale the world’s best music streaming product to 40 million users. We are strong believers in engineering teams taking operational responsibility for their products and work hard to support them in this. We work closely with engineers to advocate sensible, scalable systems design and share responsibility with them in diagnosing, resolving, and preventing production issues. We are looking for an SRE Engineering Manager in NYC and SREs in Boston and NYC.

  • IT Security Engineering. At Gusto we are on a mission to create a world where work empowers a better life. As Gusto's IT Security Engineer you'll shape the future of IT security and compliance. We're looking for a strong IT technical lead to manage security audits and write and implement controls. You'll also focus on our employee, network, and endpoint posture. As Gusto's first IT Security Engineer, you will be able to build the security organization with direct impact to protecting PII and ePHI. Read more and apply here.

Fun and Informative Events
  • Learn how Nielsen Marketing Cloud (NMC) leverages online machine learning and predictive personalization to drive its success in a live webinar on Tuesday, September 20 at 11 am PT / 2 pm ET. Hear from Nielsen’s Kevin Lyons, Senior VP of Data Science and Digital Technology, and Brent Keator, VP of Infrastructure, as well as from Brian Bulkowski, CTO and Co-Founder at Aerospike, as they describe the front-edge architecture and technical choices – including the Aerospike NoSQL database – that have led to NMC’s success. RSVP: https://goo.gl/xDQcu4

Cool Products and Services
  • ScaleArc's database load balancing software empowers you to “upgrade your apps” to consumer grade – the never down, always fast experience you get on Google or Amazon. Plus you need the ability to scale easily and anywhere. Find out how ScaleArc has helped companies like yours save thousands, even millions of dollars and valuable resources by eliminating downtime and avoiding app changes to scale. 

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in-memory database for analysing large amounts of data. It runs natively on .Net, and provides native .Net, COM & ODBC APIs for integration. It also has an easy-to-use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex measures your database servers’ work (queries), not just global counters. If you’re not monitoring query performance at a deep level, you’re missing opportunities to boost availability, turbocharge performance, ship better code faster, and ultimately delight more customers. VividCortex is a next-generation SaaS platform that helps you find and eliminate database performance problems at scale.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Categories: High Scalability

Stuff The Internet Says On Scalability For September 23rd, 2016

High Scalability - Mon, 2016-09-19 04:26

Hey, it's HighScalability time:

 

Will Minority Report for developers really help us program better? (Primitive)

 

  • October 2017: ICANN changes the DNSSEC root keys; $2.91M: cost of running Let's Encrypt; 20%: Amazon convenience tax; 100%: increase in spam; 6.2 km: Quantum teleportation across a metropolitan fibre network; March 18, 1982: birth of containers; 6 months: how long a lightning bolt can power a 60 watt bulb; trillions: EV cache hits per day @ Netflix; 5x: Spark is faster than MapReduce; billions: HTTP, Git and SSH connections served per day at GitHub; 28: # of websites in North Korea;

  • Quotable Quotes:
    • @vgcerf: It is time to admit after 18 years that the multistakeholder model of Internet operation works. #yestoIANA
    • @EricLathrop: Netflix found a 5x performance variation between AWS instances at the same price! They benchmark to avoid overpaying. @indirect #Strangeloop
    • @swardley: Perfectly reasonable @NigelBarron. Larry's statements are ludicrous, play is to milk existing customers whilst hoping to find a new future.
    • @BethanyMacri: Etsy is very anti-SOA. Monolith forever!
    • janfoeh: I've said it before here and I'll say it again: the JS ecosystem is moving in the wrong direction. Sometimes I feel that with Javascript, we developers have taken something that wasn't ours, and we're in the process of destroying the best thing there ever was about it. So here we are, the single <script> tag having been replaced with compilers, transpilers, five mutually incompatible build systems, three different module systems in God knows how many implementations, frameworks changing their API every ten minutes and five thousand lines of NPM module code to be installed for even the simplest of tasks.
    • marknadal: This is the way humans have been thinking for thousands of years. And guess what, I sat down with a large airline and had to warn them "we're not Strongly Consistent" and they laughed at me saying "you realize we've been booking seat reservations before there was internet, before you were born, and before there was cheap telephony. Seat reservation has never been strongly consistent - we used to have hundreds of travel agents booking seats and it would take 2 weeks before we would hear about it."
    • Jason Feifer: All I have to do is go to another website and see the price is different, and I don't. It's crazy. Like, why am I not doing that? We're the problem.
    • @cmeik: "The clock-free design paradigm I promote must eventually prevail. It fits Physics."
    • @gabrielgironda: mclaren and apple are a great fit. all the stability of apple's software combined with the reliability of british automobiles
    • Bryan Cantrill: The virtual machine is vestigial abstraction. We can not get to #serverless without getting rid of the VM.
    • @dchetwynd: The number of US households that only use cellular data has doubled from 10% to 20% between 2013 and 2016 #strangeloop
    • There are even more awesome Quotable Quotes in the full article.

  • Interesting results from a major architecture change at Netflix. Zuul 2 : The Netflix Journey to Asynchronous, Non-Blocking Systems. Netflix had a blocking, servlet-based, connectionless architecture and moved to a non-blocking, asynchronous architecture built around persistent connections. In general, from a latency, CPU, throughput, and capacity perspective the async version didn't perform much better than the old sync version. Netflix found "the less work a system actually does, the more efficiency we gain from async", which makes sense in terms of scheduling and IO. There was a big win, however, in the ability to scalably maintain over 83 million persistent connections, one for every client, back into their cloud infrastructure. The cost of a connection becomes a file descriptor instead of a thread, which is a lot cheaper. By using persistent connections, Netflix can reduce overall device requests, improve device performance, understand and debug the customer experience better, enable more real-time user experience innovations, and reduce overall cloud costs by replacing “chatty” device protocols today (which account for a significant portion of API traffic) with push notifications. Operations did take a hit. Sync systems are much easier to understand and debug. Also, making the migration was not easy. Changing sync code to async is not for the faint-hearted.

  • This is hilarious. Read the whole thread. You won't be disappointed. @stef: You are in a startup. All around is a burning runway. There are exits to the North and East. You have a bootstrap. There is a VC here.
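
On the Zuul 2 item above: the claim that a connection costs "a file descriptor instead of a thread" can be seen in miniature with Python's asyncio. This is only a sketch of the general non-blocking pattern, not Netflix's implementation (Zuul 2 is built on Netty); the echo protocol, the `handle` and `run_demo` names, and the connection count are all assumptions made for illustration.

```python
import asyncio
import threading

async def handle(reader, writer):
    # One coroutine per connection. While the connection is idle the
    # coroutine is parked on the event loop and holds only a socket
    # (a file descriptor), not an OS thread.
    while data := await reader.readline():
        writer.write(data)
        await writer.drain()
    writer.close()

async def run_demo(n_clients=100):
    # Port 0 asks the OS for any free port on the loopback interface.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # Open n_clients persistent connections and keep them all open.
    conns = [await asyncio.open_connection("127.0.0.1", port)
             for _ in range(n_clients)]

    replies = []
    for i, (reader, writer) in enumerate(conns):
        writer.write(f"ping {i}\n".encode())
        await writer.drain()
        replies.append((await reader.readline()).decode().strip())

    # Count OS threads while every connection is still open.
    threads = threading.active_count()

    for _, writer in conns:
        writer.close()
        await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return replies, threads

replies, threads = asyncio.run(run_demo())
print(len(replies), "connections served using", threads, "thread(s)")
```

A thread-per-connection server would need one thread for each open connection. Here the thread count should stay far below the connection count, which is the property that makes tens of millions of mostly idle persistent connections affordable. The debugging cost mentioned above also shows up: a stack trace from inside `handle` says little about which request led there.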


Categories: High Scalability

Stuff The Internet Says On Scalability For September 16th, 2016

High Scalability - Fri, 2016-09-16 15:56

Hey, it's HighScalability time:

 

The struggle for life that kills. Stunning video of bacteria mutating to defeat antibiotics. 

 

  • 60%: time spent cleaning dirty dirty BigData; 10 million: that's a lot of Raspberry Pi; 365: days living in a Mars simulation; 100M: monthly League of Legends players; 1.75 billion: copyright takedowns by Google; 3.5 petabytes: data Evernote has to move to Google cloud; 11%: YoY growth in time spent on mobile apps; 4 hours: time between Lambda coldstarts; 

  • Quotable Quotes:
    • Camille Fournier: humans struggle to tangibly understand domains that are theoretically separate when they are presented as colocated by the source code.
    • @songcarver: The better example: iPhone 7 is showing 115% of 2016 Macbook single core performance, 88% of multi-core.
    • ex3ndr: We (actor.im) also moved from google cloud to our servers + k8s. Shared persistent storage is a huge pain. We eventually stopped trying to do this, will try again when PetSets will be in Beta and will be able to update its images.
    • @mcclure111: "Well maybe you should get your spaceship working before you try to implant nanites in your brain, DUDE"
    • IOpipe: Organizations I’ve spoken to have expressed an average of 10x cost savings over microservices-based infrastructure for the code they’ve moved to AWS Lambda.
    • avitzurel: Kube is winning for the same reason React/Redux (and now Mobx) is winning and why Rails was winning at the time. Community.
    • @etherealmind: Evernote is moving to public cloud. A strong sign that it's in financial trouble, or lacking product direction.
    • @codinghorror: In 8 years of colocating servers I have seen multiple spinning rust disks fail, and one PSU, but zero SSDs failed from 2013-on.
    • Caltech: Now, with the new simulation—which used a network of thousands of computers running in parallel for 700,000 central processing unit (CPU) hours—Caltech astronomers have created a galaxy that looks like the one we live in today, with the correct, smaller number of dwarf galaxies.
    • Andy Grove: Rust is gearing up to be particularly suitable for building scalable asynchronous io and getting Rust onto servers is a great way to drive adoption of the language. 
    • James Hamilton: We have long believed that 80% of operations issues originate in design and development… When systems fail, there is a natural tendency to look first to operations since that is where the problem actually took place. Most operations issues, however, either have their genesis in design and development or are best solved there.
    • Google: even the possibility of a future quantum computer is something that we should be thinking about today.
    • Alan Kay: This doesn’t mean that “objects are now hidden”, but that they should be part of the “modeling and designing of ideas and processes” that is the center of what programming needs to be.
    • Packet Pushers: In the future the world will be made of clouds and users. The user will be sitting in Starbucks and accessing the cloud and your network will be totally irrelevant.
    • StorageMojo: Our current system for the diffusion of knowledge is breaking down. How are we going to fix it?
    • Ron Miller: Flywheel Effect is the idea that once you have your core tech pieces in place, they have an energy of their own that drives other positive changes and innovations.
    • stonogo: Intel needs everything to be NUMA-aware. They're betting a lot of money on Xeon Phi, and once the self-booting KNL machines are out nobody will want to deal with the pcie cards any more.
    • @Fruzenshtein: It's strange to listen to a talk about microservices when you have already heard about serverless architecture
Categories: High Scalability

If Traffic is an Iterated Prisoner's Dilemma Game Can Smart Cars Evolve Co-operative Behavior?

High Scalability - Tue, 2016-09-13 18:05

 

Can small tribes of cooperating smart cars improve overall traffic even if they are not in the majority? Sure, if every car was a self-driving car maybe traffic jams could dissolve like blood clots on anticoagulants, but what about that messy in-between period? It will be some time before smart cars rule the road. Until then can smart cars make traffic better?

Adoption is hard. This is a general problem in tech. You want people to join your social network yet people won't join until enough people have already joined. What you really want is that virtuous circle to develop, where as more people adopt a technology it causes even more people to adopt it. So startups spend their VC money fast and furiously in hopes of acquiring new customers betting the lifetime value of a customer will be worth the investment. VC money is the dead corpse that feeds the rest of the ecosystem.

Traffic is already an example of a vicious cycle. Horrendous traffic jams are now the norm and "good" traffic windows are just tall tales texted to children. And it keeps on getting worse and not in a worse is better sort of way. Yet the incentives are still not enough for people to self-organize and batch themselves into cars. Cars are more of a synchronous streaming model. Traffic problems will need to be solved at a different level of abstraction. Human drivers are just so hopelessly human.

In some ways traffic is like an iterated game of Prisoner's Dilemma. So in an Evolution of Cooperation sense can overall flows improve if groups of self-driving cars cooperate together within a stream of muggle cars? If smart cars on the road choose to gang up together will that improve commute times in such a way that it will encourage more and more cars to join the gang, becoming part of the solution instead of the problem?
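To make the iterated Prisoner's Dilemma framing concrete, here is a toy simulation. It is a sketch under loud assumptions: the payoffs use the conventional 5/3/1/0 values from the game theory literature, "cooperate" stands in for a car that yields or merges politely, "defect" for one that does not, and nothing here models real traffic.

```python
# Payoffs as (my_score, their_score), keyed by (my_move, their_move).
# "C" = cooperate, "D" = defect. Conventional values: temptation 5,
# reward 3, punishment 1, sucker's payoff 0.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    """Never cooperate, regardless of what the opponent does."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)
        move_b = strategy_b(hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print("cooperator vs cooperator:", play(tit_for_tat, tit_for_tat))
print("cooperator vs defector:  ", play(tit_for_tat, always_defect))
print("defector vs defector:    ", play(always_defect, always_defect))
```

Over ten rounds, two tit-for-tat players score 30 each, while two defectors score only 10 each. A defector does beat a tit-for-tat player head to head (14 to 9), but only by a small margin, because it gets to exploit the cooperator exactly once. That is the Evolution of Cooperation argument in small: cooperators do well when they meet each other often enough, which is why the size of the cooperating tribe matters.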

But we have the social network problem. Cars currently are individual, kept in silos organized by manufacturer. Tesla, Uber, Google, etc. don't cooperate at a global traffic planning level. Even cars within a manufacturer don't yet have the ability to slave themselves together in a self-driving conga line of traffic goodness.

Historically we know after individual point solutions are created the next step is to add a scheduling layer. After running a program on an entire CPU we create an OS (Linux, Windows, etc) to run multiple programs on the same CPU. After the container we create an OS (Swarm, Kubernetes, Mesos, etc) to run multiple programs on the same boxes.

We'll need a TrafficOS so all the cars that want to can cooperate together, you know like XMPP before the walls went up. Plus we'll need ecosystem incentives to help drive adoption. 

So many questions. Will drivers volunteer to be part of a smart car peloton even if it means their commute suffers in the short term? What's the tipping point? Will free riders ruin the whole thing? Like the fast lane, should incentives be created to encourage cooperating tribes of smart cars? Should traffic lights favor smart car trains? Should traffic laws allow bullet trains of smart cars to speed down the highway? Should insurance premiums be reduced for time spent protected in smart car convoys? Maybe smart car software should be seeded with altruism "genes" so they cooperate naturally? How can defectors be punished? Maybe we need a reputation system scoring for traffic reciprocity?

Unlike the weather, traffic is something we can do something about. Let's just try to do a better job than we did with social networks and IM systems. Traffic is actually important.

Categories: High Scalability

Sponsored Post: ScaleArc, Spotify, Aerospike, Scalyr, Gusto, VividCortex, MemSQL, InMemory.Net, Zohocorp

High Scalability - Tue, 2016-09-13 16:10

Categories: High Scalability

The Dollar Shave Club Architecture Unilever Bought for $1 Billion

High Scalability - Tue, 2016-09-13 15:56

This is a guest post by Jason Bosco, the Dollar Shave Club’s Director of Engineering, Core Platform & Infrastructure, on the infrastructure of its ecommerce technology.

With more than 3 million members, Dollar Shave Club will do over $200 million in revenue this year. Although most are familiar with the company’s marketing, this immense growth in just a few years since launch is largely due to its team of 45 engineers.

Dollar Shave Club engineering by the numbers:

Core Stats
Categories: High Scalability

Stuff The Internet Says On Scalability For September 9th, 2016

High Scalability - Thu, 2016-09-08 15:37

Hey, it's HighScalability time:

 

An alternate universe where Zeppelins rule the sky. 1929. (@AeroDork)

 

If you like this sort of Stuff then please support me on Patreon.
  • 15%: Facebook's reduction in latency using HTTP2's server push; 1.9x: nanotube transistors outperform silicon; 200: projectors used to film a "hologram"; 50%: of people fall for phishing attacks (it's OK to click); 5x: increased engagement using Google's Progressive Web Apps; 115,000+: Cassandra nodes at Apple; $500 million: Pokémon Go; $150M: Delta's cost for datacenter outage; 

  • Quotable Quotes: 
    • Dan Lyons: I wanted to write a book about what it’s like to be 50 and trying to reinvent yourself – that struggle. There are all these books and inspirational speakers talking about being a lifelong learner and it’s so great to reinvent yourself, the brand of you. And I wanted to say, you know, it’s not like that. It’s actually really painful.
    • Engineers & Coffee~ In modern application development everything is a stream now versus historically everything was a transaction. Make a request and then you're done. It's easier to write analytics on top of streams versus using Hive. It's cool that Kinesis is all real-time and has the power of SQL.
    • David Smith: The [iOS] market has been pulling me along towards advertising based apps, and I’ve found that the less I fight back with anachronistic ideas about how software “should” be sold, the more sustainable a business I have.
    • @tef_ebooks: (how do you keep a lisp user in suspense
    • @bodil: Use tests to verify your assumptions. Use a type checker to verify your implementations. Always.
    • tostitos1979: Here is a factoid for the youngins ... the Internet/Arpanet was created BEFORE the first microprocessor! In fact, Intel was originally founded to make RAM ICs. They only later created the first microprocessor (the 4004)!
    • gsubes: Our tests showed that even with larger messages (100k price ticks per request) pipes were still a magnitude slower [than Memory Mapping].
    • Quincy Larson: Did you know the average developer only gets two hours of uninterrupted work done a day? They spend the other 6 hours in varying states of distraction.
    • StorageMojo: Achieving lower-than-DRAM pricing requires volume, and that’s where NRAM has a competitive advantage over, say, 3D XPoint. Processing can be done on today’s flash, DRAM or logic lines. NRAM processing only needs spin coating and patterning – as well as carbon nanotubes – which modern fabs all support.
    • Xiao Mina: We’ve seen this story before: as cost of production and distribution go down, the range of creativity goes up.
    • @clarkkaren: Give humans a system and they'll game it. The End.
    • Jim Starkey: AmorphousDB is my modest effort to question everything database. The best way to think about Amorphous is to envision a relational database and mentally erase the boxes around the tables so all records free float in the same space – including data and metadata.
    • @jdub: On Reddit: “What is the use of Elastic IPs, if I can use ELB or an Auto Scaling Group instead?” STUDENT, YOU HAVE ACHIEVED ZEN OF CLOUD.
    • @BenedictEvans: A key premise for the next decade: it's easier for software to enter other industries than for other industries to hire software people
    • @jasongorman: To clarify, "dependency injection" literally just means passing an object's collaborators as constructor/method params. That's all it is.
    • jackpeterfletch: Grand solution to world hunger, available on Kindle!
    • @swardley: Optimise flow.  Often when you examine flows then you’ll find bottlenecks, inefficiencies and profitless flows.  There will be things that you’re doing that you just don’t need to. Be very careful here to consider not only efficiency but effectiveness. 
    • @PatrickMcFadin: #uber is fully replicated and active-active to make sure you never get stranded. #cassandrasummit
    • @FSVO: A monk named Chaitin found an algorithm for expressing the complexity of sutras. His master commented, “This monk could be shorter.”
    • Dotzler: We [Firefox] can learn from the competition [Chrome]. The way they implemented multi-process is RAM-intensive, it can get out of hand. We are learning from them and building an architecture that doesn’t eat all your RAM. 
    • @hichaelmart: Although CPU bound calculations [on OpenWhisk] seem about 4x slower than Lambda, so not too bad. Lambda still the winner so far though.
    • Shel Kaphan: Okay, I’m going to be building this website to run a bookstore [Amazon] and I haven’t done that before but it doesn’t sound so hard. When I’m done with that I’m not sure what I’ll do.
    • sixhobbits: "Our logger failed silently" "Shouldn't that have been recorded somewhere?" "I guess it's turtles all the way down"
    • @xmal: Trying to explain that CRDT causal contexts are a natural evolution of TCP sequence numbering and vector clocks in reliable causal broadcast
    • Joi Ito: Just like it is impossible to make another Silicon Valley somewhere else, although everyone tries—after spending four days in Shenzhen, I’m convinced that it’s impossible to reproduce this ecosystem anywhere else.
    • @adriancolyer: "My claim is that it is possible to write grand programs, noble programs, truly magnificent ones..." Knuth 1974
    • @Excellion: According to legend, if you say Blockchain three times fast, your databases will magically become immutable & your company a fintech leader.
    • bec0: The world has changed. Dennard scaling has mostly been replaced. The economic Moore's Law has morphed. It had to...we have all gotten used to its benefits.
    • @cloud_opinion: 5 stages of Cloud Grief: It's not secure / It's someone's computer / We do private cloud / Hybrid cloud  / Lambda is full of servers anyway
    • @DDD_Borat: "Why you not like framework annotations in your code?" - "Would you put bumper sticker on a Ferrari?" Rofl
    • @robert_winslow: Slow software is your fault. These are the real speed limits: billions of CPU instructions, GBs of RAM access, 100k+ SSD I/Os... per second.
    • Walter Bentley: I am proud to say, OpenStack held up to the torment. Did not experience not one single API request failure throughout my numerous load tests — yet another proof point that OpenStack is ready for enterprise/production use.
    • @xaprb: Let's fork it, say the people who have never put their heart and 5 years of their life into a product only to watch someone else fork it.
    • @adrianco: People asking Docker to slow down is like OpenStack folks asking AWS to standardize and slow down.
    • @amcafee: "In 1974, it was illegal for an airline to charge < $1,442 for a flight between New York City and Los Angeles."
    • Fairly Nerdy: For most real world scenarios, where you are betting against the house which has a house edge, f* becomes negative, which means that you shouldn’t be playing that game.  Truthfully it means that you should take the other side of the wager, become the house, and make them bet against you!
    • Judd Kaiser: Experience shows that good scalability can be achieved on 10 GigE networking provided that you stay above about 50,000 cells per core. That means, for example, that a 20 M cell problem shows good scaling up to about 400 cores; beyond that, interprocess communication latency begins to dominate and scaling degrades.
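
The arithmetic in the Judd Kaiser quote above is easy to check. Here is a minimal sketch; the function name is made up for illustration, and the 50,000 cells-per-core floor is simply the rule of thumb from the quote for 10 GigE, not a universal constant.

```python
def max_useful_cores(total_cells, min_cells_per_core=50_000):
    """Largest core count that still leaves each core at or above the cell floor."""
    return total_cells // min_cells_per_core


# The 20 M cell problem from the quote.
print(max_useful_cores(20_000_000))  # 400
```

Past that core count each core holds fewer cells than the floor, which is the regime where the quote says communication latency starts to dominate.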

  • Maybe the real reason Uber wants driverless cars is that hiring, er...onboarding, drivers from across the globe is a really tough problem to solve. Each location has its own processes, and that kills scalability. Screening processes and regulations vary, some countries have a very long list of required documents, and onboarding flows vary. Here's the story: How Uber Engineering Massively Scaled Global Driver Onboarding. So you can't use the same app everywhere. The solution was, as it often is, to go meta and dynamic: the onboarding state machine (OSM) makes it possible to "easily configure a set of steps for each onboarding process in each country, state, city, or any level of granularity we need, coupled with an event system that allows us to easily switch users from one step to another depending on their actions or input." The onboarding API can then easily query the OSM to know which step in the process a user is at. Clients are now stateless, responsible only for their UI, with 100% of the business logic in the shared back end. They went from Flask to Tornado and a lighter version of their initial JSON schema architecture, where only data is passed to the client, not UI definitions.
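
To make the onboarding state machine idea concrete, here is a minimal sketch of a configurable, event-driven flow. It is not Uber's implementation: the step names, event names, location keys, and the `FLOWS`, `flow_for`, and `OnboardingStateMachine` names are all invented for illustration. The only ideas taken from the item above are per-location step configuration, events that move a user between steps, and a back end that can always say which step a user is on.

```python
from dataclasses import dataclass

# Hypothetical per-location flows: ordered (step, completing_event) pairs.
FLOWS = {
    "default": [
        ("create_account", "account_created"),
        ("upload_license", "license_uploaded"),
        ("background_check", "check_passed"),
    ],
    # A made-up location that requires one extra document.
    "fr/paris": [
        ("create_account", "account_created"),
        ("upload_license", "license_uploaded"),
        ("upload_extra_permit", "permit_uploaded"),
        ("background_check", "check_passed"),
    ],
}


def flow_for(location):
    """Return the most specific configured flow: 'fr/paris', then 'fr', then 'default'."""
    parts = location.split("/")
    while parts:
        key = "/".join(parts)
        if key in FLOWS:
            return FLOWS[key]
        parts.pop()
    return FLOWS["default"]


@dataclass
class OnboardingStateMachine:
    location: str
    position: int = 0  # index of the step the user is currently on

    def current_step(self):
        steps = flow_for(self.location)
        return steps[self.position][0] if self.position < len(steps) else "done"

    def handle(self, event):
        """Advance only when the event completes the current step; ignore anything else."""
        steps = flow_for(self.location)
        if self.position < len(steps) and steps[self.position][1] == event:
            self.position += 1
        return self.current_step()


user = OnboardingStateMachine("fr/paris")
print(user.current_step())             # create_account
print(user.handle("account_created"))  # upload_license
print(user.handle("check_passed"))     # upload_license (event ignored on this step)
```

Because the flow lives in configuration and the position lives on the server, a client only needs to ask for the current step and render it, which is what lets the clients stay stateless.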

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

Code Generation: The Inner Sanctum of Database Performance

High Scalability - Wed, 2016-09-07 16:14

This is a guest post by Drew Paroski, architect and engineering manager at MemSQL. Previously he worked at Facebook and developed HHVM, the popular real-time PHP compiler used across the company’s web scale application.

Achieving maximum software efficiency through native code generation can bring superior scaling and performance to any database. And making code generation a first-class citizen of the database, from the beginning, enables a rich set of speed improvements that provide benefits throughout the software architecture and end-user experience.

If you decide to build a code generation system you need to clearly understand the costs and benefits, which we detail in this article. If you are willing to go all the way in the name of performance, we also detail an approach to save you time leveraging existing compiler tools and frameworks such as LLVM in a proven and robust way.

Code Generation Basics
Categories: High Scalability

Stuff The Internet Says On Scalability For September 2nd, 2016

High Scalability - Fri, 2016-09-02 15:56

Hey, it's HighScalability time:

 

Spectacular iconic drawing of Aurora Borealis as observed in 1872. (Drawings vs. NASA Images)
  • 4,000 GB: projected bandwidth used per autonomous vehicle per day; 100K: photos of US national parks; 14 terabytes: code on Github in 1 billion files held in 400K repositories; 25: age of Linux; $5 billion: cost of labor for building Linux; $3800: total maintenance + repairs after 100K miles and 5 years of Tesla ownership; 2%: reduction in Arizona's economy by deporting all illegal immigrants; 15.49TB: available research data; 6%: book readers who are "digital only";

  • Quotable Quotes
    • @jennyschuessler: "Destroy the printing press, I beg you, or these evil men will triumph": Venice, 1473
    • @Carnage4Life: Biggest surprise in this "Uber for laundry" app shutting down is that there are still 3 funded startups in the space
    • @tlipcon: "backpressure" is right up there with "naming things" on the top 10 list of hardest parts of programming
    • cmcluck: Please consider K8s [kubernetes] a legitimate attempt to find a better way to build both internal Google systems and the next wave of cloud products in the open with the community. We are aware that we don't know everything and learned a lot by working with people like Clayton Coleman from Red Hat (and hundreds of other engineers) by building something in the open. I think k8s is far better than any system we could have built by ourselves. And in the end we only wrote a little over 50% of the system. Google has contributed, but I just don't see it as a Google system at this point.
    • looncraz: AMD is not seeking the low end, they are trying to redefine AMD as the top-tier CPU company they once were. They are aiming for the top and the bulk of the market.
    • lobster_johnson: Swarm is simple to the point of naivety.
    • @BenedictEvans: That is, vehicle crashes, >90% caused by human error & 30-40% by alcohol, cost $240bn & kill 30k each year just in the USA. Software please
    • @joshsimmons: "Documentation is like serializing your mental state." - @ericholscher, just one of many choice moments in here.
    • @ArseneEdgar: "better receive old data fast rather than new data slow"
    • @aphyr: hey if you're looking for a real cool trip through distributed database research, https://github.com/cockroachdb/cockroach/blob/develop/docs/design.md … is worth several reads
    • @pwnallthethings: It's a fact 0day policy-wonks consistently get wrong. 0day are merely lego bricks. Exploits are 0day chains. Mitigations make chains longer.
    • andrewguenther: Speaking of [Docker] 1.12, my heart sank when I saw the announcement. Native swarm adds a huge level of complexity to an already unstable piece of software. Dockercon this year was just a spectacle to shove these new tools down everyone's throats and really made it feel like they saw the container parts of Docker as "complete." 
    • @johnrobb: Foxconn just replaced 60,000 workers with robots at its Kushan facility in China.  600 companies follow suit.
    • @epaley: Well publicized - Uber has raised ~$15B. Yet the press is shocked @Uber is investing billions. Huh? What was the money for? Uber kittens?
    • Ivan Pepelnjak: One of the obsessions of our industry is to try to find a one-size-fits-everything solutions. It's like trying to design something that could be a mountain bike today and an M1 Abrams tomorrow. Reality doesn't work that way
    • There were so many good quotes this week that they wouldn't all fit here. Please see the full post to read all the wonderfulness.

  • This should concern every iPhone user. Total ownage.
    • Steve Gibson, in Security Now 575, gives a great explanation of the previously unknown, professional-grade zero-day iPhone exploits, Pegasus & Trident, that use a chain of flaws to remotely jailbreak an iPhone. It's completely stealthy, surviving both reboots and upgrades. The exploits have been around for years and were only identified by accident. It's a beautiful hack.
    • Your phone is totally open and it happens just like in the movies: A user infected with this spyware is under complete surveillance by the attacker because, in addition to the apps listed above, it also spies on: Phone calls, Call logs, SMS messages the victim sends or receives, Audio and video communications that (in the words of a founder of NSO Group) turns the phone into a 'walkie-talkie'.
    • Bugs happen in complicated software. Absolutely. But these exploits were for sale...for years. The companies that sell these exploits do not have to disclose them. Apple should be going to the open market and buying these exploits so they can learn about them and fix them. Apple should be outbidding everyone in their bug bounty system so they can find hacks and fix them.
    • Paying for exploits is not an ethical issue, it's smart business in a realpolitik world. If you can figure out the Double Irish With a Dutch Sandwich you can figure out how to go to the open market and find out all the ways you are being hacked. Apple needs to think about security strategically, not only as a tactical technical issue.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

My Test Tube Filled with DNA is Better than Your Mesos Cluster

High Scalability - Wed, 2016-08-31 16:10

 

We’ve seen computation using slime mold, soap film, and water droplets; there’s even a 10,000 Domino Computer. Now DNA can do math in a test tube, using addition, subtraction, multiplication, and division.

It’s not fast. Calculations can take hours. The upside: they are tiny and can work in wet environments. Think of running calculations in your bloodstream or in cells, like a programmable firewall, to monitor and alert on targeted health metrics and then trigger a localized response. Or if you are writing science fiction perhaps the ocean could become one giant computer?

The applications already sound like science fiction:

Prior devices for control of chemical reaction networks and DNA doctor applications have been limited to finite-state control, and analog DNA circuits will allow much more sophisticated analog signal processing and control. DNA robotics have allowed devices to operate autonomously (e.g., to walk on a nanostructure) but also have been limited to finite-state control. Analog DNA circuits can allow molecular robots to include real-time analog control circuits to provide much more sophisticated control than offered by purely digital control. Many artificial intelligence systems (e.g., neural networks and probabilistic inference) that dynamically learn from environments require analog computation, and analog DNA circuits can be used for back-propagation computation of neural networks and Bayesian probabilistic inference systems.

How does it work?
Categories: High Scalability

The cat-and-mouse story of implementing anti-spam for Mail.Ru Group’s email service and what Tarantool has to do with this

High Scalability - Tue, 2016-08-30 15:56

Hey guys!

In this article, I’d like to tell you the story of implementing the anti-spam system for Mail.Ru Group’s email service and share our experience of using the Tarantool database within this project: what tasks Tarantool serves, what limitations and integration issues we faced, what pitfalls we fell into and how we finally arrived at a revelation.

Let me start with a short backtrace. We started introducing anti-spam for the email service roughly ten years ago. Our first filtering solution was Kaspersky Anti-Spam together with RBL (Real-time Blackhole List, a real-time list of IP addresses associated with spam mailouts). This allowed us to decrease the flow of spam messages, but due to the system’s inertia, we couldn’t suppress spam mailouts quickly enough (i.e., in real time). The other requirement that wasn’t met was speed: users should have received verified email messages with a minimal delay, but the integrated solution was not fast enough to catch up with the spammers. Spam senders are very fast at changing their behavior model and the look of their spam content when they find out that spam messages are not being delivered. So, we couldn’t put up with the system’s inertia and started developing our own spam filter...

Categories: High Scalability

Sponsored Post: Spotify, Aerospike, Exoscale, Host Color, Scalyr, Gusto, LaunchDarkly, VividCortex, MemSQL, InMemory.Net, Zohocorp

High Scalability - Tue, 2016-08-30 15:56

Who's Hiring?
  • Spotify is looking for individuals passionate about infrastructure to join our Site Reliability Engineering organization. Spotify SREs design, code, and operate tools and systems to reduce the amount of time and effort necessary for our engineers to scale the world’s best music streaming product to 40 million users. We are strong believers in engineering teams taking operational responsibility for their products and work hard to support them in this. We work closely with engineers to advocate sensible, scalable systems design and share responsibility with them in diagnosing, resolving, and preventing production issues. We are looking for an SRE Engineering Manager in NYC and SREs in Boston and NYC.

  • IT Security Engineering. At Gusto we are on a mission to create a world where work empowers a better life. As Gusto's IT Security Engineer you'll shape the future of IT security and compliance. We're looking for a strong IT technical lead to manage security audits and write and implement controls. You'll also focus on our employee, network, and endpoint posture. As Gusto's first IT Security Engineer, you will be able to build the security organization with direct impact to protecting PII and ePHI. Read more and apply here.

Fun and Informative Events
  • High-Scalability Database Beer Bash. Come join Aerospike and like-minded peers on Wednesday, September 7 from 6:30-8:30 PM in San Jose, CA for an informal meet-up of great food and libations. You'll have the chance to learn about Aerospike's high-performance NoSQL database for mission-critical applications, and about the use cases of the companies switching to Aerospike from first-generation NoSQL databases such as Cassandra and Redis. Feel free to invite colleagues and peers! RSVP: bit.ly/DBbeer
Cool Products and Services
  • Do you want a simpler public cloud provider but you still want to put real workloads into production? Exoscale gives you VMs with proper firewalling, DNS, S3-compatible storage, plus a simple UI and straightforward API. With datacenters in Switzerland, you also benefit from strict Swiss privacy laws. From just €5/$6 per month, try us free now.

  • High Availability Cloud Servers in Europe: High Availability (HA) is very important on the Cloud. It ensures business continuity and reduces application downtime. High Availability is a standard service on the European Cloud infrastructure of Host Color, active by default for all cloud servers, at no additional cost. It provides uniform, cost-effective failover protection against any outage caused by a hardware or an Operating System (OS) failure. The company uses VMware Cloud computing technology to create Public, Private & Hybrid Cloud servers. See Cloud service at Host Color Europe.

  • Dev teams are using LaunchDarkly’s Feature Flags as a Service to get unprecedented control over feature launches. LaunchDarkly allows you to cleanly separate code deployment from rollout. We make it super easy to enable functionality for whoever you want, whenever you want. See how it works.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a .NET-native in-memory database for analysing large amounts of data. It runs natively on .NET and provides native .NET, COM & ODBC APIs for integration. It also has an easy-to-use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex measures your database servers’ work (queries), not just global counters. If you’re not monitoring query performance at a deep level, you’re missing opportunities to boost availability, turbocharge performance, ship better code faster, and ultimately delight more customers. VividCortex is a next-generation SaaS platform that helps you find and eliminate database performance problems at scale.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Categories: High Scalability

Stuff The Internet Says On Scalability For August 26th, 2016

High Scalability - Fri, 2016-08-26 15:56

Hey, it's HighScalability time:

 

 

The Pixar render farm in 1995 is half of an iPhone (@BenedictEvans)

 

If you like this sort of Stuff then please support me on Patreon.
  • 33.0%: of all retail goods sold online in the US are sold on Amazon;  110.9 million: monthly Amazon unique visitors; 21 cents: cost of 30K batch derived page views on Lambda; 4th: grade level of Buzzfeed articles; $1 trillion: home value threatened by rising sea levels; $1.2B: Uber lost $1.2B on $2.1B in revenue in H1 2016; 1.58 trillion: miles Americans drove through June; 

  • Quotable Quotes:
    • @bendystraw: My best technical skill isn't coding, it's a willingness to ask questions, in front of everyone, about what I don't understand
    • @vmg: "ls is the IDE of producing lists of filenames"
    • @nicklockwood: The hardest problem in computer science is fighting the urge to solve a different, more interesting problem than the one at hand.
    • @RexRizzo: Wired: "Machine learning will TAKE OVER THE WORLD!" Amazon: "We see you bought a wallet. Would you like to buy ANOTHER WALLET?"
    • @viktorklang: "The very existence of Ethernet flow control may come as a shock" - http://jeffq.com/blog/the-ethernet-pause-frame/ 
    • @JoeEmison: 4/ (c) if you need stuff on prem, keep it on prem. No need to make your life harder by hooking it up to some bullshit that doesn't work well
    • @grayj_: Also people envision more than you think. Wright Brothers to cargo flights: 7 yrs. Steam engine to car: 7 yrs.
    • David Wentzlaff: With Piton, we really sat down and rethought computer architecture in order to build a chip specifically for data centres and the cloud
    • @thenewstack: In 2015, there was 1 talk about #microservcies at OSCON; in 2016, there were 30: @dberkholz #CloudNativeDay
    • The Memory Guy: Now for the bad news: This new technology [3D XPoint] will not be a factor in the market if Intel and Micron can’t make it, and last week’s IDF certainly gave little reason for optimism.
    • @Carnage4Life: $19 billion just to link WhatsApp graph with Facebook's is mundane. Expect deeper, more insidious connections coming
    • Seth Lloyd~ The universe is a quantum computer. Biological life is all about extracting meaningful information from a sea of bits.
    • Facebook: To automate such design changes, the team introduced new models to FBNet in which IPs and circuits were allocated using design tools based on predefined rules, and relevant config snippets were generated for deployment.
    • Robert Graham: Despite the fact that everybody and their mother is buying iPhone 0days to hack phones, it's still the most secure phone. Androids are open to any old hacker -- iPhone are open only to nation state hackers.
    • oppositelock: I'm a former Google engineer working at another company now, and we use http/json rpc here. This RPC is the single highest consumer of cpu in our clusters, and our scale isn't all that large. I'm moving over to gRPC asap, for performance reasons.
    • Gary Sims: The purposes and goals of Fuchsia are still a mystery, however it is a serious undertaking. Dart is certainly key, as is Flutter.
    • @mjpt777: "We haven't made all that much progress on parallel computing in all those years." - Barbara Liskov
    • @AnupGhosh_: Just another sleepy August: 1. NSA crown jewels hacked. 2. Apple triple 0-day weaponized. 3. Short selling vulnerabilities for fun & profit.
    • @JoeEmison: Hypothesis: enterprises adopted CloudFoundry because at least it gets up and running (cf OpenStack), but now finding it so inferior to AWS.
    • Robert Metcalfe: I predict the Internet will soon go spectacularly supernova and in 1996 catastrophically collapse.
    • Alan Cooper~ Form follows function to Hell. If you are building something out of bits what does form follows function mean? Function follows the user. If you are focussing on functions you are missing the point. 
    • @etherealmind: I've _never_ seen a successful outsourcing arrangement. And I've worked on both sides in more than 10 companies.
    • @musalbas: Schools need to stop spending years teaching kids garbage Microsoft PowerPoint skills and teach them Unix sysadmin skills.
    • Dan Woods: With data lakes there’s no inherent way to prioritize what data is going into the supply chain and how it will eventually be used. The result is like a museum with a huge collection of art, but no curator with the eye to tell what is worth displaying and what’s not.
    • Jay Kreps: Unlike scalability, multi-tenancy is something of a latent variable in the success of systems. You see hundreds of blog posts on benchmarking infrastructure systems—showing millions of requests per second on vast clusters—but far fewer about the work of scaling a system to hundreds or thousands of engineers and use cases. It’s just a lot harder to quantify multi-tenancy than it is to quantify scalability.
    • Jay Kreps: the advantage of Kafka is not just that it can handle that large application but that you can continue to deploy more and more apps to the same cluster as your adoption grows, without needing a siloed cluster for each use. 
    • @vambenepe: My secret superpower is using “reply” in situations where most others would use “reply all”.
    • @tvanfosson: Developer progression: instead of junior to senior 1. Simple and wrong 2. Complicated and wrong 3. Complicated and right 4. Simple and right
    • Maria Konnikova: The real confidence game feeds on the desire for magic, exploiting our endless taste for an existence that is more extraordinary and somehow more meaningful.
    • gpderetta: Apple A9 is a quite sophisticated CPU; there is no reason to believe it is not using a state-of-the-art predictor. The Samsung CPU might not have any advantage at all in this area.
    • Chetan Sharma: For 4G, we went from 0% to 25% penetration in 60 months, 25-50% in 21 months, 50-75% in 24 months and by the end of 2020, we will have 95%+ penetration. By 2020, US is likely to be 4 years ahead of Europe and 3 years ahead of China in LTE penetration. In fact, the industry vastly underestimated the growth of 4G in the US market. Will 5G growth curves be any different?

  • You know what's cool? A rubberband powered refrigerator. Or trillions of dollars...in space mining. Space Mining Company Plans to Launch Asteroid-Surveying Spacecraft by 2020. Billionaires get your rockets ready. It's a start: Weighing about 110 pounds, Prospector-1 will be powered by water, expelling superheated vapor to generate thrust. Since water will be the first resource mined from asteroids, this water propulsion system will allow future spacecraft–the ones that do the actual mining–to refuel on the go.

  • False positives in the new fully automated, algorithm-driven world are red in tooth and claw. We may need a law. You know that feeling when you use your credit card and you are told it is no longer valid? You are cut off. Some algorithm has decided to isolate you from the world. At least you can call a credit card company. Have you ever tried to call a Cloud Company? Fred Trotter tells a scary story of not being able to face his accuser in Google Intrusion Detection Problem: So today our Google Cloud Account was suspended...Google threatened to shut our cloud account down in 3 days unless we did something…but made it impossible to complete that action...Google Cloud services shutdown the entire project...It is not safe to use any part of Google Cloud Services because their threat detection system has a fully automated allergic reaction to anything it has not seen before, and it is capable of taking down all of your cloud services, without limitation.

  • In the "every car should come with a buggy whip" department we have The Absurd Fight Over Fund Documents You Probably Don't Read. $200 million would be saved if investors got their mutual fund reports online instead of on paper. You guessed it, there's a paper lobby against it. 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

The Always On Architecture - Moving Beyond Legacy Disaster Recovery

High Scalability - Tue, 2016-08-23 23:42
Failover does not cut it anymore. You need an ALWAYS ON architecture with multiple data centers. -- Martin Van Ryswyk, VP of Engineering at DataStax

Failover, switching to a redundant or standby system when a component fails, has a long and checkered history as a way of dealing with failure. The reason is that your failover mechanism becomes a single point of failure that often fails just when it's needed most. Having worked on a few telecom systems that used a failover strategy, I know exactly how stressful failover events can be and how stupid you feel when your failover fails. If you have a double or triple fault in your system, failover is exactly the time when it will happen.

For a long time the only real trick we had for achieving fault tolerance was to have a hot, warm, or cold standby (disk, interface, card, server, router, generator, datacenter, etc.) and fail over to it when there's a problem. This old style of Disaster Recovery planning is no longer adequate or necessary.

Now, thanks to cloud infrastructures, at least at a software system level, we have an alternative: an always on architecture. Google calls this a natively multihomed architecture. You can distribute data across multiple datacenters in such a way that all your datacenters are always active. Each datacenter can automatically scale capacity up and down depending on what happens to other datacenters. You know, the usual sort of cloud propaganda. Robin Schumacher makes a good case here: Long live Dear CXO – When Will What Happened to Delta Happen to You?
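
To make the capacity point concrete, here is a minimal Python sketch of how load might be spread across active datacenters and re-spread when one disappears. It is not taken from any of the linked articles. The datacenter names, the `rebalance` and `required_headroom` helpers, and the even-split policy are all assumptions made for illustration; a real system would weight by capacity, latency, and data locality.

```python
def rebalance(total_rps, datacenters, healthy):
    """Split total_rps evenly across the datacenters that are still healthy.

    Hypothetical helper: the even split is an assumption for illustration.
    """
    active = [dc for dc in datacenters if dc in healthy]
    if not active:
        raise RuntimeError("no healthy datacenters left")
    share = total_rps / len(active)
    return {dc: share for dc in active}


def required_headroom(n_datacenters, tolerated_failures=1):
    """Factor by which each datacenter must be able to grow so that the
    survivors can absorb the load of the failed ones."""
    survivors = n_datacenters - tolerated_failures
    if survivors <= 0:
        raise ValueError("cannot tolerate losing every datacenter")
    return n_datacenters / survivors


if __name__ == "__main__":
    dcs = ["us-east", "us-west", "eu-west"]  # made-up names
    # All three active: each carries a third of the traffic.
    print(rebalance(30000, dcs, healthy=set(dcs)))
    # One datacenter lost: the two survivors split everything.
    print(rebalance(30000, dcs, healthy={"us-east", "eu-west"}))
    print(required_headroom(3))
```

Under these assumptions, three active datacenters each carry a third of the traffic, and losing one means the two survivors carry half each. Every datacenter therefore has to be able to reach about 1.5x its steady-state capacity, either by over-provisioning or by scaling up quickly.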

Recent Problems With Disaster Recovery
Categories: High Scalability

Stuff The Internet Says On Scalability For August 19th, 2016

High Scalability - Fri, 2016-08-19 15:56

Hey, it's HighScalability time:

 


Modern art? Nope. Pancreatic cancer revealed by fluorescent labeling.

 

If you like this sort of Stuff then please support me on Patreon.
  • 4: SpaceX rocket landings at sea; 32TB: 3D Vertical NAND Flash; 10x: compute power for deep learning as the best of today’s GPUs; 87%: of vehicles could go electric without any range problems; .06%: visitors that post comments on NPR; 235k: terrorism related Twitter accounts closed; 40%: AMD improvement in instructions per clock for Zen; 15%: apps are slower in summer because of humidity;

  • Quotable Quotes:
    • @netik: There is no Internet of Things. There are only many unpatched, vulnerable small computers on the Internet.
    • @Pinboard: The Programmers’ Credo: we do these things not because they are easy, but because we thought they were going to be easy
    • Aphyr: This advantage is not shared by sequential consistency, or its multi-object cousin, serializability. This much, I knew–but Herlihy & Wing go on to mention, almost offhand, that strict serializability is also nonlocal!
    • @PHP_CEO: I’VE HAD AN IDEA / WE’LL TAKE ALL THE BAD CODE / BUNDLE IT TOGETHER / AND SELL IT TO VCS AS A COLLATERALIZED TECHNICAL DEBT OBLIGATION
    • felixgallo: I agree, the actor model is a significantly more usable metaphor for containers than functions. When you start thinking about supervisor trees, you start heading towards Kubernetes, which is interesting.
    • David Rosenthal: So in practice blockchains are decentralized (not), anonymous (not and not), immutable (not), secure (not), fast (not) and cheap (not). What's (not) to like?
    • @grimmelm: You know, you can’t spell “idiotic” without “IoT”
    • @jroper: 10 years ago, backends were monolithic services and frontends many pages. Now frontends are monolithic pages and backends many services.
    • @jakevoytko: Ordinary human: Hey, this is a fork. You can eat with it! People who comment on programming blogs: You can't eat soup with that.
    • iLoch: Wow $5000/mo for 2000rps, just for the application servers? That's absurd. I think we're paying around $2000/mo for our app servers, a database which is over 2TB in size, and we ingest about 10 megabytes of text data per second, on top of a couple thousand requests per second to the user facing application.
    • @josh_wills: I'm thinking about writing a book on data engineering for kids: "An Immutable, Append-Only Log of Unfortunate Events"
    • Kill Process: What the world needs is not a new social network that concentrates power in a single place, but a design to intrinsically prevent the concentration of power that results in barriers to switching.
    • ljmasternoob: the bump was just Schrödinger's cat stepping on Occam's razor.
    • carsongross: The JVM is a treasure just sitting there waiting to be rediscovered.
    • @mjpt777: When @nitsanw points out some of what he finds in the JVM I often end up crying :(
    • @karpathy: I hoped TensorFlow would standardize our code but it's low level so we've diverged on layers over it: Slim, PrettyTensor, Keras, TFLearn ...
    • @rbranson:  coordination is a scaling bottleneck in teams as much as it is in distributed systems.
    • @mathiasverraes: There are only two hard problems in distributed systems:  2. Exactly-once delivery 1. Guaranteed order of messages 2. Exactly-once delivery
    • @PhilDarnowsky: I've been using dynamically typed languages for a living for a decade. As a result, I prefer statically typed languages.
    • Allyn Malventano: 64-Layer is Samsung's 4th generation of V-NAND. We've seen 48-Layer and 32-Layer, but few know that 24-Layer was a thing (but was mainly in limited enterprise parts).
    • @cmeik: "It's a bit odd to me that programming languages today only give you the ability to write something that runs on one machine..." [1/2]
    • @trengriffin: @amcafee Use of higher radio frequencies will require a lot more antennas creating ever smaller coverage areas. More heterogeneous bandwidth
    • @jamesurquhart: Disagree IaaS multicloud tools will play major role moving forward. Game is in PaaS and app deployment (containers).

  • Linking it all together on a great episode of This Week In Tech. Google’s new OS, Fuchsia, for places where Android fears to tread, smaller, lower power IoT type devices. Intel Optane is an almost shipping non-volatile memory that is 1000X faster than SSD (maybe not), has up to 10X the capacity of DRAM, while only being a few X slower than typical DRAM, is perfect for converged IoT devices. Say goodbye to blocks and memory tiers. IoT devices don't have to be fast, so DRAM can be replaced with this new memory, hopefully making simpler cheaper devices that can last a decade on a small battery, especially when combined with low power ARM CPUs. NVMe is replacing SATA and AHCI for higher bandwidth, lower latency access to non-volatile memory. 5G, when it comes out, will specifically support billions of low power IoT devices. Machine learning ties everything together. That future that is full of sensors may actually happen. As Greg Ferro said~ We are starting to see the convergence of multiple advances. You can start to plot a pathway forward to see where the disruption occurs. The irony, still, is nothing will work together. We have ubiquitous wifi more from a fluke of history than any conscious design. We see how when left up to industry the silo mindset captures all reason, and we are all the poorer for it.

  • We have water rights. Mineral rights. Surface rights. Is there such a thing as virtual property rights? Do you own the virtual property rights of your own property when someone else decides to use it in an application? Pokemon GO Hit With Class Action Lawsuit; Why do people keep coming to this couple’s home looking for lost phones?

  • As data becomes more valuable, it becomes assumed that we are the product. Provider of Personal Finance Tools Tracks Bank Cards, Sells Data to Investors: Yodlee has another way of making money: The company sells some of the data it gathers from credit- and debit-card transactions to investors and research firms...“Yodlee can tell you down to the day how much the water bill was across 25,000 citizens of San Francisco” or the daily spending at McDonald’s throughout the country...The details are so valuable that some investment firms have paid more than $2 million apiece for an annual subscription to Yodlee’s service.


Categories: High Scalability

Sponsored Post: Zohocorp, Exoscale, Host Color, Cassandra Summit, Scalyr, Gusto, LaunchDarkly, Aerospike, VividCortex, MemSQL, AiScaler, InMemory.Net

High Scalability - Tue, 2016-08-16 15:56

Who's Hiring?
  • IT Security Engineering. At Gusto we are on a mission to create a world where work empowers a better life. As Gusto's IT Security Engineer you'll shape the future of IT security and compliance. We're looking for a strong IT technical lead to manage security audits and write and implement controls. You'll also focus on our employee, network, and endpoint posture. As Gusto's first IT Security Engineer, you will be able to build the security organization with direct impact to protecting PII and ePHI. Read more and apply here.

Fun and Informative Events
  • High-Scalability Database Beer Bash. Come join Aerospike and like-minded peers on Wednesday, September 7 from 6:30-8:30 PM in San Jose, CA for an informal meet-up of great food and libations. You'll have the chance to learn about Aerospike's high-performance NoSQL database for mission-critical applications, and about the use cases of the companies switching to Aerospike from first-generation NoSQL databases such as Cassandra and Redis. Feel free to invite colleagues and peers! RSVP: bit.ly/DBbeer

  • Join database experts from companies like Apple, ING, Instagram, Netflix, and many more to hear about how Apache Cassandra changes how they build, deploy, and scale at Cassandra Summit 2016. This September in San Jose, California is your chance to network, get certified, and trained on the leading NoSQL, distributed database with an exclusive 20% off with promo code - Academy20. Learn more at CassandraSummit.org

Cool Products and Services
  • Do you want a simpler public cloud provider but you still want to put real workloads into production? Exoscale gives you VMs with proper firewalling, DNS, S3-compatible storage, plus a simple UI and straightforward API. With datacenters in Switzerland, you also benefit from strict Swiss privacy laws. From just €5/$6 per month, try us free now.

  • High Availability Cloud Servers in Europe: High Availability (HA) is very important on the Cloud. It ensures business continuity and reduces application downtime. High Availability is a standard service on the European Cloud infrastructure of Host Color, active by default for all cloud servers, at no additional cost. It provides uniform, cost-effective failover protection against any outage caused by a hardware or an Operating System (OS) failure. The company uses VMware Cloud computing technology to create Public, Private & Hybrid Cloud servers. See Cloud service at Host Color Europe.

  • Dev teams are using LaunchDarkly’s Feature Flags as a Service to get unprecedented control over feature launches. LaunchDarkly allows you to cleanly separate code deployment from rollout. We make it super easy to enable functionality for whoever you want, whenever you want. See how it works.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides native .Net, COM & ODBC APIs for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex measures your database servers’ work (queries), not just global counters. If you’re not monitoring query performance at a deep level, you’re missing opportunities to boost availability, turbocharge performance, ship better code faster, and ultimately delight more customers. VividCortex is a next-generation SaaS platform that helps you find and eliminate database performance problems at scale.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Also available on Amazon Web Services. Free instant trial, 2 hours of FREE deployment support, no sign-up required. http://aiscaler.com

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

 

If any of these items interest you there's a full description of each sponsor below...

Categories: High Scalability

How PayPal Scaled to Billions of Transactions Daily Using Just 8VMs

High Scalability - Mon, 2016-08-15 15:56

How did PayPal take a billion-hits-a-day system that might traditionally run on 100s of VMs and shrink it down to run on 8 VMs, stay responsive even at 90% CPU, at transaction densities PayPal has never seen before, with jobs that take 1/10th the time, while reducing costs and allowing for much better organizational growth without growing the compute infrastructure accordingly?

PayPal moved to an Actor model based on Akka. PayPal told their story here: squbs: A New, Reactive Way for PayPal to Build Applications. They open sourced squbs and you can find it here: squbs on GitHub.
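
For readers who haven't met the Actor model, here is a toy sketch of the core idea: each actor owns private state and processes messages from its mailbox one at a time, so callers never share or lock that state. This is plain Python threads and queues, not Akka or squbs, and the `Actor` and `CounterActor` classes are hypothetical names invented for illustration. It shows the shape of the model only, not how PayPal built anything.

```python
import queue
import threading


class Actor:
    """A toy actor: private state, a mailbox, and one thread draining it."""

    _STOP = object()  # sentinel that tells the actor to shut down

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Fire-and-forget send; the caller never touches actor state."""
        self._mailbox.put(message)

    def stop(self):
        """Let the actor finish the messages already queued, then wait for it."""
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        # Messages are handled strictly one at a time, in arrival order.
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError


class CounterActor(Actor):
    """Hypothetical example actor that tallies messages by kind."""

    def __init__(self):
        self.counts = {}  # set up state before the mailbox thread starts
        super().__init__()

    def receive(self, message):
        self.counts[message] = self.counts.get(message, 0) + 1


if __name__ == "__main__":
    counter = CounterActor()
    for kind in ["payment", "refund", "payment"]:
        counter.tell(kind)
    counter.stop()
    print(counter.counts)
```

Because only the actor's own thread ever mutates `counts`, any number of senders can call `tell` concurrently without locks in application code. Frameworks like Akka add what this sketch leaves out: supervision, scheduling many actors over a few threads, and distribution across machines.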

The stateful service model still doesn't get enough consideration when projects are choosing a way of doing things. To learn more about stateful services there's an article, Making The Case For Building Scalable Stateful Services In The Modern Era, based on a great talk given by Caitie McCaffrey. And if that doesn't convince you here's WhatsApp, who used Erlang, an Akka competitor, to achieve incredible throughput: The WhatsApp Architecture Facebook Bought For $19 Billion.

I refer to the above articles because the PayPal article is short on architectural details. It's more about the factors that led to the selection of Akka and the benefits they've achieved by moving to Akka. But it's a very valuable motivating example for doing something different than the status quo.

What's wrong with the services on lots of VMs approach?

Categories: High Scalability

Stuff The Internet Says On Scalability For August 12th, 2016

High Scalability - Wed, 2016-08-10 15:20

Hey, it's HighScalability time:

 

 

The big middle finger to the Olympic Committee. They pulled this video of the incredibly beautiful Olympic cauldron at Rio.

 

If you like this sort of Stuff then please support me on Patreon.
  • 25 years ago: the first website went online; $236M: Pokemon Go revenue in 5 weeks in 3 countries; Several thousand: work on Apple Maps; 2500 Nimitz Carriers: weight of iPhone if implemented using tube transistors; $50 trillion: cost of iPhone in 1950, economic output of the world in your hand; 1000x: faster phase-change RAM; 15lbs: Americans heavier than 20 years ago; 2 years: for hacking the IRS; 3.6PB: hypothetical storage pod based on 60 TB SSD; 330,000: cash registers hacked; 162%: increased love for electric cars in China;

  • Quotable Quotes:
    • @carllerche: it is hard to imagine how a node app could get closer to the metal with only 20MM LOC between the app and the hardware.
    • David Heinemeier Hansson (RoR)~ Lots and lots of huge systems that are running the gosh darn Internet are built by remote people operating asynchronously. You don't think that's good enough for your little shop?
    • Cesarini: Some frameworks that try to automate activities end up failing to hide complexity. They limit the trade-offs you can make, so they cater only to a subset of systems, often with very detailed requirements. 
    • "Uncle" Bob Martin: I have lived through 22 orders of magnitude of growth in hardware.
    • Jovanovic: To use Bitcoin for real-time trades, we need to eliminate its lazy fork-resolution mechanism and adopt strong consistency, a more proactive approach that guarantees transaction persistence.
    • Pedro Ramalhete: one latency distribution plot is worth a thousand throughput measurements
    • @n1ko_w1ll: Impressive numbers: - 80% cut code with #scala - responsive at 90% load with #akka
    • @samkroon: So Aussie government is asking 20 million ppl to login to one web site on the same night... Fail. Should have gone #serverless. #census2016
    • @caitie: "My contribution to RPC is not to make another system based on RPC" @cmeik #NikeTechTalks
    • @krisajenkins: This is your return type: Int / This is your return type on microservices: IO / (Logger (Either HttpError Int)) Microservices: Know the risks.
    • @nosqlonsql: Latency drives throughput if you cannot achieve enough concurrency. Kafka vs Chronicle. Must read by @PeterLawrey
    • reddit: Today's date is 100/1000/10000 in binary
    • @caitie: "The languages we associate with distributed programming are really concurrent languages" @cmeik #NikeTechTalks
    • @goserverless: Lambda down :( #aws #serverless
    • @pkanavos: @goserverless I think I'll PaaS
    • Jan Wedel: So if you plan to build an application from scratch and it is only meant to be used in on-premise scenarios as described, you probably shouldn't go for a microservice architecture.
    • @bmoesta: Any industry that solely focuses on efficiency innovation is on the verge of death. Disruptive innovations that drive progress drive growth
    • flak: It’s quite likely that your crypto will explode sooner or later, and it’s possible that random numbers will be implicated, but it’s very unlikely that some USB gizmo promising “true random” at kilobits per second will save you. Save your money instead.
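
The @nosqlonsql quote above ("Latency drives throughput if you cannot achieve enough concurrency") is Little's Law in disguise: requests in flight = throughput × latency. A small Python sketch makes the arithmetic visible. The function names and the example numbers are assumptions chosen for illustration, not measurements from the Kafka vs Chronicle article.

```python
def max_throughput(concurrency, latency_seconds):
    """Little's Law rearranged: throughput = requests in flight / latency."""
    if latency_seconds <= 0:
        raise ValueError("latency must be positive")
    return concurrency / latency_seconds


def concurrency_needed(target_throughput, latency_seconds):
    """Requests that must be in flight to sustain target_throughput."""
    if latency_seconds <= 0:
        raise ValueError("latency must be positive")
    return target_throughput * latency_seconds


if __name__ == "__main__":
    # One request at a time at 10 ms each: latency alone caps throughput.
    print(max_throughput(concurrency=1, latency_seconds=0.010))
    # Same latency, eight requests in flight.
    print(max_throughput(concurrency=8, latency_seconds=0.010))
    # In-flight requests needed to sustain 2,000 requests/second at 10 ms.
    print(concurrency_needed(target_throughput=2000, latency_seconds=0.010))
```

With a single request in flight at 10 ms, throughput tops out around 100 requests per second no matter how fast the hardware is. The only ways up are lower latency or more concurrency, which is the point of the quote.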

  • Imagine how much the world has changed in those 25 years. The world's first website went online 25 years ago today. Without the Web the Internet would probably still be a backwater for researchers. The Web was the Internet's killer app. It's hard to imagine Pokemon is Augmented Reality's killer app. AR needs its let-the-people-make-it-bigger-and-better technology. Given the balkanization of AR into proprietary silos AR may never have its Web moment. Will there be an HTTP for AR?

  • The phrase "small, reprogrammable quantum computer" doesn't sound remotely present-tense, but it is: Shantanu Debnath and colleagues at the University of Maryland reveal their new device can solve three algorithms using quantum effects to perform calculations in a single step, where a normal computer would require several operations. Although the new device consists of just five bits of quantum information (qubits), the team said it had the potential to be scaled up to a larger computer...the key to the new device was a system of laser pulses that drove the quantum logic gates, which operate like the switches and transistors that power ordinary computers.

  • Turning programmers into a proper profession, like doctors, is not the way to go. How much do doctors innovate? Very little. Doctors as a profession have been pounded into their current shape by two oppressors: fear of lawsuits and educational debt. Doctors are bound by best practices and oaths to do nothing interesting. What must programmers do constantly? Innovate and do the interesting. By not being a profession we are free to do harm, yes, but we are also able to create. Creation is a better failure mode than ossification. "Uncle" Bob Martin - "The Future of Programming". Nice gloss by Eric Fleming: Long story short this was really two talks in one. The first speech was about progress in hardware and software from 1945 to 2015. The second talk is about how there is so much growth in the programming field that there are too many young inexperienced people to do it right which necessitates some self regulatory body to bring young professionals into the flock. Ironically the talk he didn't intend to give, the first one, is far more interesting than the talk he did give about how to fix the growing inexperience in industry.

  • Don't let what happened in Turkey happen to your coup attempt. Learn from experience. Here's your step-by-step guide on How to Overthrow a Government. Presented at, you may be surprised to hear, DefCon. First select from a menu of three regime change methods: elections, coups and revolution. Next select a crack insurgency team from a handy wizard interface. Then there's a drop down list of intelligence gathering resources and funding options. After a few more clicks just press Go and you have your revolution (you'll certainly choose revolution, you get so many more points that way).


Categories: High Scalability